Performance is a feature. For many Google applications it is the feature that makes everything else possible—instant text and voice search, directions, translations, and more. The platforms and infrastructure teams at Google are always on the leading edge of development and deployment of new performance best practices. These best practices, in turn, benefit your applications running on Google Cloud Platform.

One great example of Cloud Platform leveraging technology improvements in the Google infrastructure is support for SPDY and HTTP/2. The Google Chrome team began experimenting with SPDY back in 2009 and Google servers were the first to support it. SPDY proved to deliver significant performance benefits and helped shape the development of the HTTP/2 protocol. Now, with HTTP/2 finalized (RFC 7540, RFC 7541) and rapidly gaining client adoption, Google servers are, once again, leading the way by being among the first to offer full HTTP/2 support.

HTTP/2 on Google Cloud Platform
We’re not breaking any news here, because Google Cloud Platform support for SPDY has been enabled for many years, and support for its successor (HTTP/2) was enabled earlier this year. Many performance-conscious developers are already leveraging the performance benefits of the HTTP/2 protocol in their applications, and so should you! Here’s what you need to know:

  1. Your application must run over HTTPS. Support for the HTTP/2 protocol is automatically negotiated as part of the TLS handshake via ALPN.
  2. The server terminating the secure HTTPS connection must support the ALPN and HTTP/2 protocols. If either side does not support HTTP/2, the connection falls back to HTTP/1.1.

If your application is already running over HTTPS, and the secure session is terminated by one of Google servers, then—good news—you’re already HTTP/2 enabled! Let’s take a peek under the hood to see how this works.

[Figure: HTTP/2 protocol negotiation flowchart]

Google Cloud Storage
Consider a scenario in which you upload a file into Google Cloud Storage and get an HTTPS link for the resource, which you then reference or share with others. When the resource is fetched, the client and Google server negotiate the TLS session and HTTP/2 protocol support. If the client supports HTTP/2, the Google server automatically selects it; otherwise it falls back to HTTP/1.1. In short, just upload a file, provide an HTTPS link to it, and your job is done.
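To see this negotiation in action, here is a minimal Python sketch that offers HTTP/2 via ALPN during the TLS handshake and prints what the server selected. The host name is just an example; any HTTPS endpoint terminated by Google's front ends behaves the same way.

    import socket
    import ssl

    HOST = "storage.googleapis.com"  # example host; substitute your own HTTPS endpoint

    ctx = ssl.create_default_context()
    ctx.set_alpn_protocols(["h2", "http/1.1"])  # offer HTTP/2 first, HTTP/1.1 as fallback

    with socket.create_connection((HOST, 443)) as sock:
        with ctx.wrap_socket(sock, server_hostname=HOST) as tls:
            # Prints "h2" when the server negotiated HTTP/2, "http/1.1" otherwise.
            print(HOST, "negotiated:", tls.selected_alpn_protocol())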

Google App Engine
In both its original incarnation and its Managed VM configuration, Google App Engine servers are responsible for negotiating and terminating the TLS tunnel. As long as your application is served over HTTPS, the Google servers automatically negotiate HTTP/2 on your behalf.

Each HTTP/2 stream is then translated into an HTTP/1.1 request and routed to your application. As a result, no modifications are required within your application to enable HTTP/2. Simply direct your traffic through HTTPS and all capable visitors will be automatically served over HTTP/2.
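If your app still accepts plain HTTP, a small sketch like the following can nudge clients onto HTTPS at the application layer. App Engine's app.yaml "secure: always" handler setting is the more idiomatic switch; this generic WSGI middleware is only an illustration.

    def require_https(app):
        """Generic WSGI middleware: redirect plain-HTTP requests to their HTTPS equivalent."""
        def wrapper(environ, start_response):
            if environ.get("wsgi.url_scheme") != "https":
                location = "https://" + environ.get("HTTP_HOST", "") + environ.get("PATH_INFO", "/")
                start_response("301 Moved Permanently", [("Location", location)])
                return [b"Redirecting to HTTPS"]
            return app(environ, start_response)
        return wrapper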

Google Compute Engine and Google Container Engine
If you’re running a virtual machine, or an entire cluster, via Google Compute Engine or Google Container Engine, then you have multiple options for where and how the inbound connections are routed:

  1. You can expose the virtual instance directly, in which case you must set up and manage the necessary infrastructure to terminate the TLS tunnel, negotiate HTTP/2, and process the HTTP/2 session—see the HTTP/2 wiki for a list of open source servers.
  2. You can use a network load balancer to distribute inbound TCP connections to one of multiple servers, which then have to terminate the TLS tunnel and negotiate HTTP/2—the same as with exposing the virtual instance directly.
  3. You can use an HTTPS load balancer, which terminates the TLS tunnel and negotiates HTTP/2 on your behalf. The load balancer translates inbound HTTP/2 streams into HTTP/1.1 requests and forwards them to one of your servers, just as Google App Engine does. This is the quickest and easiest way to enable HTTP/2 for your Compute/Container-powered application.

There shouldn’t be any surprises with any of the above. To enable HTTP/2, you need to enable HTTPS, and then either let one of the Google servers terminate it and do all the heavy lifting on your behalf, or route the connection to one of your HTTP/2 capable servers.

Note: There is much more to be said about the benefits and optimization strategies for HTTP/2 (it's a big and important upgrade), but that's outside the scope of this post. If you're curious, however, check out the free chapters on HTTP/2 and HTTP/2 optimization strategies in High Performance Browser Networking (O’Reilly).

A QUIC peek into the future

The Google Chrome and platform teams continue to innovate by experimenting with QUIC, a UDP-based transport for HTTP/2, which enables faster 0-RTT secure handshakes, improved congestion control, loss recovery, and other mechanisms that promise to further improve performance:

The data shows that 75 percent of connections can take advantage of QUIC’s zero-round-trip feature. QUIC outshines TCP under poor network conditions, shaving a full second off the Google Search page load time for the slowest 1% of connections. These benefits are even more apparent for video services like YouTube. Users report 30% fewer rebuffers when watching videos over QUIC. This means less time spent staring at the spinner and more time watching videos.
                                              - from A QUIC update on Google’s experimental transport

And speaking of being on the leading edge of performance: many applications powered by App Engine are already speaking QUIC to capable Chrome clients, and we’re hard at work to bring QUIC to all other products as well. As with HTTP/2, the Google servers terminate the UDP flow and translate the QUIC protocol into HTTP/1.1 requests, enabling a transparent performance upgrade for millions of existing applications running on Google Cloud Platform.

Tip: To quickly and easily see what protocol is negotiated, install the HTTP/2 indicator extension for Google Chrome. A blue bolt indicates HTTP/2; a red bolt indicates QUIC.

- Posted by Mark Mandel, Google Cloud Platform Developer Advocate and Ilya Grigorik, Internet Plumber.

Even the most careful developer will make the occasional mistake, and when it comes to security, mistakes can be disastrous. One of our goals is to make it easier for you to develop secure web applications and to find and fix issues early in the development lifecycle.

Today, we are pleased to announce the general availability of Cloud Security Scanner, a tool that enables App Engine developers to proactively test their applications for many common web application security vulnerabilities. For example, it can detect issues like cross-site scripting (XSS), mixed content, and Flash injection, or alert you to the use of insecure JavaScript libraries.

The tool is easy to set up and use, and it is well suited for the modern, complex, JavaScript-heavy applications that App Engine enables you to build and deliver.

Cloud Security Scanner is available free of charge for Google Cloud Platform customers, so please visit the Cloud Security Scanner page to get started.

We’d also like to thank all of the beta testers who have provided great feedback to the product team over the past couple of months. We really appreciate the support.

- Posted by Matthew O’Connor, Product Manager

Cloud native technologies like Kubernetes help you compose scalable services out of a sea of small logical units. In our last post, we introduced Vitess (an open-source project that powers YouTube's main database) as a way of turning MySQL into a scalable Kubernetes application. Our goal was to make scaling your persistent datastore in Kubernetes as simple as scaling stateless app servers - just run a single command to launch more pods. We've made a lot of progress since then (pushing over 2,500 new commits) and we're nearing the first stable version of the new, cloud native Vitess.

Vitess 2.0
In preparation for the stable release, we've begun to publish alpha builds of Vitess v2.0.0. Some highlights of what's new since our earlier post include:

  • Using the final Kubernetes 1.0 API.
  • Official Vitess client libraries in Java, Python, PHP, and Go.
    • Java and Go clients use the new HTTP/2-based gRPC framework.
  • Can now run on top of MySQL 5.6, in addition to MariaDB 10.0.
  • New administrative dashboard built on AngularJS.
  • Built-in backup/restore, designed to plug into blob stores like Google Cloud Storage.
  • GTID-based reparenting for reversible, routine failovers.
  • Simpler schema changes.

We've also been hard at work adding lots more documentation. In particular, the rest of this post will explore one of our new walkthroughs that demonstrates transparent resharding of a live database - that is, changing the number of shards without any code changes or noticeable downtime for the application.

Vitess Sharding
Sharding is bitter medicine, as S. Alex Smith wrote. It complicates your application logic and multiplies your database administration workload. But sharding is especially important when running MySQL in a cloud environment, since a single node can only become so big. Vitess takes care of shard routing logic, so the data-access layer in your application stays simple. It also automates per-shard administrative tasks, helping a small team manage a large fleet.

The preferred sharding strategy in Vitess is what we call range-based shards. You can think of the shards as being like the buckets of a hash table. We decide which bucket to place a record in based solely on its key, so we don't need a separate table that keeps track of which bucket each key is in.

To make it easy to change the number of buckets, we use consistent hashing. That means instead of using a hash function that maps each key to a bucket number, we use a function that maps each key to a randomly distributed (but consistent) value in a very large set - such as the set of all 8-byte sequences. Then we assign each bucket a range of these values, which we call keyspace IDs.
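To make the idea concrete, here is a small illustrative Python sketch (not Vitess code) of range-based routing over a hypothetical two-shard split, using Vitess-style shard names where "-80" covers the lower half of the keyspace-ID space and "80-" the upper half:

    import bisect

    # Hypothetical two-way split of the 8-byte keyspace-ID space.
    # Shard "-80" holds IDs whose first byte is below 0x80; shard "80-" holds the rest.
    SHARD_LOWER_BOUNDS = [0x00, 0x80]
    SHARD_NAMES = ["-80", "80-"]

    def shard_for(keyspace_id: bytes) -> str:
        # Find the last shard whose lower bound is <= the ID's first byte.
        index = bisect.bisect_right(SHARD_LOWER_BOUNDS, keyspace_id[0]) - 1
        return SHARD_NAMES[index]

    # Example: an ID starting with 0x7f routes to "-80"; one starting with 0xc0 to "80-".

Adding more shards just means splitting these ranges more finely; the key-to-keyspace-ID mapping itself never changes.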

Transparent Resharding
If you want to follow along with the new resharding walkthrough, you'll need to first bring up the cluster as described in the unsharded guide. Both guides use the same sample app, which is a Guestbook that supports multiple, numbered pages.

In the sample app code, you'll see a get_keyspace_id() function that transforms a given page number into a value in the set of all 8-byte sequences, establishing the mapping we need for consistent hashing. In the unsharded case, these values are stored but not used. When we introduce sharding, page numbers will be evenly distributed (on average) across all the shards we create, allowing the app to scale to support an arbitrary number of pages.
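For illustration, a function along these lines could look like the following sketch; the sample app's actual implementation may differ in its choice of hash.

    import hashlib
    import struct

    def get_keyspace_id(page: int) -> bytes:
        # Hash the page number to a consistent, evenly distributed 8-byte value.
        # (Illustrative only; any stable, well-distributed hash works.)
        digest = hashlib.md5(struct.pack(">Q", page)).digest()
        return digest[:8]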

Before resharding, you'll see a single custom shard named "0" in the Vitess dashboard. This is what an unsharded keyspace looks like.

As you begin the resharding walkthrough, you'll bring up two new shards for the same keyspace. During resharding, the new shards will run alongside the old one, but they'll remain idle (Vitess will not route any app traffic to them) until you're ready to migrate. In the dashboard, you'll see all three shards, but only shard "0" is currently active.

Next, you'll run a few Vitess commands to copy the schema and data from the original shard. The key to live migration is that once the initial snapshot copy is done, Vitess will automatically begin replicating fresh updates on the original shard to the new shards. We call this filtered replication, since it distributes DMLs only to the shards to which they apply. Vitess also includes tools that compare the original and copied data sets, row-by-row, to verify data integrity.

Once you've verified the copy, and filtered replication has caught up to real-time updates, you can run the migrate command, which tells Vitess to atomically shift app traffic from the old shards to the new ones. It does this by disabling writes on the old masters, waiting for the new masters to receive the last events over filtered replication, and then enabling writes on the new masters. Since the process is automated, this typically only causes about a second of write unavailability.

Now you can tear down the old shard, and verify that only the new ones show up in the dashboard.

Note that we never had to tell the app that we were changing from one shard to two. The resharding process was completely transparent to the app, since Vitess automatically reroutes queries on-the-fly as the migration progresses.

At YouTube, we've used Vitess to transparently reshard (both horizontally and vertically) nearly all of our MySQL databases within the last year alone, and we have still more on the horizon as we continue to grow. See the full walkthrough instructions if you want to try it out for yourself.

Scaling Benchmarks
The promise of sharding is that it allows you to scale write throughput linearly by adding more shards, since each shard is actually a separate database. The challenge in achieving that separation while still presenting a simple, unified view to the application is to avoid introducing bottlenecks. To demonstrate this scaling in the cloud, we've integrated the Vitess client with a driver for the Yahoo! Cloud Serving Benchmark (YCSB).

Below you can see preliminary results for scaling write throughput by adding more shards in Vitess running on Google Container Engine. For this benchmark, we pointed YCSB at the load balancer for our Vitess cluster and told it to send a lot of INSERT statements. Vitess took care of routing statements to the various shards.
The max throughput (QPS) for a given number of shards is the point at which round-trip write latency became degraded, which we define as >15ms on average or >50ms for the worst 1% of queries (99th percentile).

We also ran YCSB's "read mostly" workload (95% reads, 5% writes) to show how Vitess can scale read traffic by adding replicas. The max throughput here is the point at which round-trip read latency became degraded, which we define as >5ms on average or >20ms for the worst 1% of queries.
There's still a lot of room to improve the benchmarks (for example, by tuning the performance of MySQL itself). However, these preliminary results show that the returns don't diminish as you scale. And since you're scaling horizontally, you're not limited by the size of a single machine.

With the new cloud native version of Vitess moving towards a stable launch, we invite you to give it a try and let us know what else you'd like to see in the final release. You can reach us either on our discussion forum, or by filing an issue on GitHub. If you'd like to be notified of any updates on Vitess, you can subscribe to our low-frequency announcement list.

- Posted by Anthony Yeh, Software Engineer, YouTube

Cloud Spin, Part 1 and Part 2 introduced the Google Cloud Spin project, an exciting demo built for Google Cloud Platform Next, and how we built the mobile applications that orchestrated 19 Android phones to record simultaneous video. 
And now the last step is to retrieve the videos from each phone, find the frame corresponding to an audio cue in each video, and compile those images into a 180-degree animated GIF. This post explains the design decisions we made for the Cloud Spin back-end processing and how we built it.
The following figure shows a high-level view of the back-end design:
1. A mobile app running on each phone uploads the raw video to a Google Cloud Storage bucket.

Video taken by one of the cameras

2. An extractor process running on a Google Compute Engine instance finds and extracts the single frame corresponding to the audio cue.

3. A stitcher process running on an App Engine Managed VM combines the individual frames into a video that pans across a 180-degree view of an instant in time, and then generates a corresponding animated GIF.

How we built the Cloud Spin back-end services
After we’d designed what the back-end services would do, there were several challenges to solve as we built them. We had to figure out how to:
  • Store large amounts of video, frame, and GIF data
  • Extract the frame in each video that corresponds to the audio cue
  • Merge frames into an animated GIF
  • Make the video processing run quickly
Storing video, frame, and GIF data
We decided the best place to store incoming raw videos, extracted frames, and the resulting animated GIFs was Google Cloud Storage. It’s easy to use, integrates well with mobile devices, provides strong consistency, and automatically scales to handle large amounts of traffic, should our demo become popular. 
We also configured the Cloud Storage buckets with Object Change Notifications that kicked off the back-end video processing when the Recording app uploaded new video from the phones.
Extracting a frame that corresponds to an audio cue
Finding the frame corresponding to the audio cue, or beep, poses challenges. Audio and video are recorded at different qualities and different sample rates, so it takes some work to match them up. We needed to find the frame that matched the noisiest section of the audio. To do so, we grouped the audio samples into frame intervals, each interval containing the audio that roughly corresponds to a single video frame. We computed the average noise of each interval by calculating the average of the squared amplitudes of its samples. Once we identified the interval with the largest average noise, we extracted the corresponding video frame as a PNG file.
We wrote the extractor process in Python and used MoviePy, a module for video editing that uses the FFmpeg framework to handle video encoding and decoding.
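A simplified sketch of that approach using MoviePy might look like this; the real extractor differs in detail, and the 44.1 kHz sample rate here is an assumption.

    import numpy as np
    from moviepy.editor import VideoFileClip

    def extract_cue_frame(video_path, out_png, audio_fps=44100):
        clip = VideoFileClip(video_path)
        samples = clip.audio.to_soundarray(fps=audio_fps)  # shape: (n_samples, channels)
        per_frame = int(audio_fps / clip.fps)               # audio samples per video frame

        # Average squared amplitude ("noise") of each video-frame-sized audio interval.
        n_frames = samples.shape[0] // per_frame
        intervals = samples[:n_frames * per_frame].reshape(n_frames, -1)
        noise = np.mean(intervals ** 2, axis=1)

        # Save the video frame that lines up with the loudest interval as a PNG.
        loudest = int(np.argmax(noise))
        clip.save_frame(out_png, t=loudest / clip.fps)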
Merging frames into an animated GIF
The process of generating an animated GIF from a set of video frames can be done with only four FFmpeg commands, run by a Bash script. First we generate a video by stitching all the frames together in order, then extract a color palette from it, and finally use that palette to generate a lower-resolution GIF to upload to Twitter.
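As a rough illustration of that pipeline, the equivalent steps could be driven from Python as shown below. The actual Bash script uses four commands and different parameters; the frame pattern, frame rate, and output width here are assumptions.

    import subprocess

    def frames_to_gif(frame_pattern, gif_path, fps=15, width=480):
        # 1. Stitch the numbered PNG frames into a panning video.
        subprocess.check_call(["ffmpeg", "-y", "-framerate", str(fps),
                               "-i", frame_pattern, "pan.mp4"])
        # 2. Build an optimized color palette from that video.
        subprocess.check_call(["ffmpeg", "-y", "-i", "pan.mp4",
                               "-vf", "scale={}:-1,palettegen".format(width), "palette.png"])
        # 3. Render a lower-resolution GIF using the palette.
        subprocess.check_call(["ffmpeg", "-y", "-i", "pan.mp4", "-i", "palette.png",
                               "-filter_complex",
                               "scale={}:-1[x];[x][1:v]paletteuse".format(width), gif_path])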
Making the video processing run quickly
Processing the videos one-by-one on a single machine would take longer than we wanted, forcing the next participants to wait to see their animated GIF. Cloud Spin takes 19 videos for each demo, one from each phone in the 180-degree arc. If extracting the synchronized frame from each video takes 5 seconds, and merging the frames takes 10 seconds, with serial processing the time between taking the demo shot and the final animated GIF would be (19 * 5s + 10s = 105s), almost two minutes!
We can make the process faster by parallelizing the frame extraction. If we use 19 virtual machines, one to process each video, the time between the demo shot and the animated GIF is only 15 seconds. To make this improvement work, we had to modify our design to handle synchronization of multiple machines.
Parallelizing the workload
We developed the extraction and stitching processes as independent applications, which made it easy to parallelize the frame extraction. We can run 19 extractors and one stitcher, each as a Docker container on Google Compute Engine.
But how do we make sure that each video is processed by one, and only one, extractor? Google Cloud Pub/Sub is a messaging system that solves this problem in a performant and scalable way. Using Cloud Pub/Sub, we can create a communication channel that is loosely coupled across subscribers. This means that the extractor and stitcher applications interact through Cloud Pub/Sub, with no assumptions about the underlying implementation of either application. This makes future evolutions of the infrastructure easier to implement.
The preceding diagram shows two Cloud Pub/Sub topics that act as processing queues for the extractor and stitcher applications. Each time a mobile app uploads a new video, a message is published on the videos topic. The extractors subscribe to the videos topic. When a new message is published, the first extractor to pull it down holds a lease on the message while it processes the video to extract the frame corresponding to the audio cue. If the processing completes successfully, the extractor acknowledges the videos message and publishes a new message to the frames topic. If the extractor process fails, the lease on the videos message expires and Cloud Pub/Sub redelivers the message so it can be handled by another extractor.
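Here is a minimal sketch of the extractor's side of this flow, written against today's google-cloud-pubsub client rather than the 2015-era API; the project, subscription, and topic names, as well as the extract_cue_frame_to_gcs helper, are hypothetical.

    from google.cloud import pubsub_v1

    PROJECT = "cloud-spin-demo"  # hypothetical project ID

    subscriber = pubsub_v1.SubscriberClient()
    publisher = pubsub_v1.PublisherClient()
    videos_subscription = subscriber.subscription_path(PROJECT, "videos-extractor")
    frames_topic = publisher.topic_path(PROJECT, "frames")

    def handle_video(message):
        try:
            video_path = message.data.decode("utf-8")
            frame_path = extract_cue_frame_to_gcs(video_path)  # assumed helper
            publisher.publish(frames_topic, frame_path.encode("utf-8"))
            message.ack()    # success: the message will not be redelivered
        except Exception:
            message.nack()   # failure: Pub/Sub redelivers it to another extractor

    # Block the process and dispatch incoming "videos" messages to handle_video.
    subscriber.subscribe(videos_subscription, callback=handle_video).result()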
When a message is published on the frames topic, the stitcher pulls it down and waits until all of the frames of a single session are ready to be stitched together into an animated GIF. For the stitcher application to detect when it has all the frames, it needs a way to check the real-time status of every frame in a session.
Managing frame status with Firebase
Part 2 discussed how we managed orchestrating the phones to take simultaneous video using a Firebase database that provides real-time synchronization.
We also used Firebase to track the process of extracting the frame from each camera that corresponds to the audio cue. To do so, we added a status field to each extracted frame in the session as shown in the following screenshot.
When the Android phone takes the video, it sets this status to RECORDING, and then to UPLOADING when it uploads the video to Cloud Storage. The extractor process sets the status to READY when the frame matching the audio cue has been extracted. When all of the frames in a session are READY, the stitcher process combines the extracted frames into an animated GIF, stores the GIF in Cloud Storage, and records its path in Firebase.
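For illustration, updating that status from the extractor could look roughly like this, using today's firebase_admin SDK; the database URL and the sessions/frames schema here are assumptions rather than the demo's real layout.

    import firebase_admin
    from firebase_admin import credentials, db

    # Hypothetical database URL; the demo's actual URL and schema differ.
    firebase_admin.initialize_app(
        credentials.ApplicationDefault(),
        {"databaseURL": "https://cloud-spin-demo.firebaseio.com"})

    def mark_frame_ready(session_id, camera_id, frame_path):
        # Flip this camera's frame to READY and record where the extracted PNG lives.
        ref = db.reference("sessions/{}/frames/{}".format(session_id, camera_id))
        ref.update({"status": "READY", "framePath": frame_path})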
Having the status stored in Firebase made it possible for us to create a dashboard that showed, in real time, each step of the processing of the video taken by the Android phones and the resulting animated GIF.
We finished development of the Cloud Spin backend in time for the Google Cloud Platform Next events, and together with the mobile apps that captured the video, ran a successful demo. 
You can find more Cloud Spin animations on our Twitter feed: @googlecloudspin. We plan to release the complete demo code. Watch Cloudspin on GitHub for updates.
This is the final post in the three-part series on Google Cloud Spin. I hope you had as much fun discovering this demo as we did building it. Our goal was to demonstrate the possibilities of Google Cloud Platform in a fun way, and to inspire you to build something awesome!
- Posted by Francesc Campoy Flores, Google Cloud Platform