Google Research Blog
The latest news from Research at Google
Graph-powered Machine Learning at Google
Thursday, October 06, 2016
Posted by Sujith Ravi, Staff Research Scientist, Google Research
Recently, there have been significant advances in Machine Learning that enable computer systems to solve complex real-world problems. One of those advances is Google’s large-scale, graph-based machine learning platform, built by the Expander team in Google Research. A technology that is behind many of the Google products and features you may use every day, graph-based machine learning is a powerful tool that can be used to power useful features such as reminders in Inbox and smart messaging in Allo, or used in conjunction with deep neural networks to power the latest image recognition system in Google Photos.
Learning with Minimal Supervision
Much of the recent success in deep learning, and machine learning in general, can be attributed to models that demonstrate high predictive capacity when trained on large amounts of labeled data -- often millions of training examples. This is commonly referred to as “supervised learning” since it requires supervision, in the form of labeled data, to train the machine learning systems. (Conversely, some machine learning methods operate directly on raw data without any supervision, a paradigm referred to as unsupervised learning.)
However, the more difficult the task, the harder it is to get sufficient high-quality labeled data. It is often prohibitively labor intensive and time-consuming to collect labeled data for every new problem. This motivated the Expander research team to build new technology for powering machine learning applications at scale and with minimal supervision.
Expander’s technology draws inspiration from how humans learn to generalize and bridge the gap between what they already know (labeled information) and novel, unfamiliar observations (unlabeled information). Known as “semi-supervised” learning, this powerful technique enables us to build systems that can work in situations where training data may be sparse. Key advantages of a graph-based semi-supervised machine learning approach are that (a) one models labeled and unlabeled data jointly during learning, leveraging the underlying structure in the data, and (b) one can easily combine multiple types of signals (for example, relational information from Knowledge Graph along with raw features) into a single graph representation and learn over them. This is in contrast to other machine learning approaches, such as neural network methods, in which it is typical to first train a system using labeled data with features and then apply the trained system to unlabeled data.
Graph Learning: How It Works
At its core, Expander’s platform combines semi-supervised machine learning with large-scale graph-based learning by building a multi-graph representation of the data with nodes corresponding to objects or concepts and edges connecting concepts that share similarities. The graph typically contains both labeled data (nodes associated with a known output category or label) and unlabeled data (nodes for which no labels were provided). Expander’s framework then performs semi-supervised learning to label all nodes jointly by propagating label information across the graph.
However, this is easier said than done! We have to (1) learn efficiently at scale with minimal supervision (i.e., a tiny amount of labeled data), (2) operate over multi-modal data (i.e., heterogeneous representations and various sources of data), and (3) solve challenging prediction tasks (i.e., large, complex output spaces) involving high-dimensional data that might be noisy.
One of the primary ingredients in the entire learning process is the graph and choice of connections. Graphs come in all sizes and shapes and can be combined from multiple sources. We have observed that it is often beneficial to learn over multi-graphs that combine information from multiple types of data representations (e.g., image pixels, object categories and chat response messages for PhotoReply in Allo). The Expander team’s graph learning platform automatically generates graphs directly from data based on the inferred or known relationships between data elements. The data can be structured (for example, relational data) or unstructured (for example, sparse or dense feature representations extracted from raw data).
To understand how Expander’s system learns, let us consider an example graph shown below.
There are two types of nodes in the graph: “grey” represents unlabeled data whereas the colored nodes represent labeled data. Relationships between node data are represented via edges, and the thickness of each edge indicates the strength of the connection. We can formulate the semi-supervised learning problem on this toy graph as follows: predict a color (“red” or “blue”) for every node in the graph. Note that the specific choice of graph structure and colors depends on the task. For example, as shown in this research paper we recently published, a graph that we built for the Smart Reply feature in Inbox represents email messages as nodes and colors indicate semantic categories of user responses (e.g., “yes”, “awesome”, “funny”).
The Expander graph learning framework solves this labeling task by treating it as an optimization problem. At the simplest level, it learns a color label assignment for every node in the graph such that neighboring nodes are assigned similar colors depending on the strength of their connection. A naive way to solve this would be to try to learn a label assignment for all nodes at once -- this method does not scale to large graphs. Instead, we can optimize the problem formulation by propagating colors from labeled nodes to their neighbors, and then repeating the process. In each step, an unlabeled node is assigned a label by inspecting color assignments of its neighbors. We can update every node’s label in this manner and iterate until the whole graph is colored. This process is a far more efficient way to optimize the same problem and the sequence of iterations converges to a unique solution in this case. The solution at the end of the graph propagation looks something like this:
Semi-supervised learning on a graph
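To make the propagation idea concrete, here is a minimal sketch in Python of iterative label propagation on a small, invented weighted graph. The node names, edge weights, and two-color label set are purely illustrative; this shows the general technique described above, not Expander’s actual implementation.

```python
from collections import defaultdict

# Undirected weighted edges on a toy graph: (node_a, node_b, edge_weight).
edges = [
    ("a", "b", 1.0), ("b", "c", 0.5), ("c", "d", 2.0),
    ("d", "e", 1.0), ("b", "e", 0.2),
]
# Seed labels for the few labeled nodes ("red"/"blue" stand in for the colors).
seeds = {"a": "red", "d": "blue"}

neighbors = defaultdict(list)
for u, v, w in edges:
    neighbors[u].append((v, w))
    neighbors[v].append((u, w))

# Each node keeps a distribution over labels; seeds start with their label.
dist = {n: defaultdict(float) for n in neighbors}
for n, label in seeds.items():
    dist[n][label] = 1.0

for _ in range(20):                         # iterate until (approximately) converged
    new_dist = {}
    for n in neighbors:
        if n in seeds:                      # labeled nodes keep their label
            new_dist[n] = dist[n]
            continue
        agg = defaultdict(float)
        for m, w in neighbors[n]:           # weighted vote from each neighbor
            for label, score in dist[m].items():
                agg[label] += w * score
        total = sum(agg.values())
        if total > 0:                       # normalize back to a distribution
            agg = defaultdict(float, {l: s / total for l, s in agg.items()})
        new_dist[n] = agg
    dist = new_dist

for n in sorted(neighbors):
    best = max(dist[n], key=dist[n].get) if dist[n] else "unlabeled"
    print(n, "->", best)
```

Running the loop, the nodes nearest the “red” seed end up red and those nearest the “blue” seed end up blue, weighted by edge strength, mirroring the colored solution pictured above.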
In practice, we use complex optimization functions defined over the graph structure, which incorporate additional information and constraints for semi-supervised graph learning and can lead to hard, non-convex problems. The real challenge, however, is to scale this efficiently to graphs containing billions of nodes and trillions of edges, and to complex tasks involving billions of different label types.
To tackle this challenge, we created an approach outlined in Large Scale Distributed Semi-Supervised Learning Using Streaming Approximation, published last year. It introduces a streaming algorithm that processes information propagated from neighboring nodes in a distributed manner, which makes it work on very large graphs. In addition, it addresses other practical concerns; notably, it guarantees that the space complexity or memory requirements of the system stay constant regardless of the difficulty of the task, i.e., the overall system uses the same amount of memory regardless of whether the number of prediction labels is two (as in the above toy example) or a million or even a billion. This enables wide-ranging applications for natural language understanding, machine perception, user modeling and even joint multimodal learning for tasks involving multiple modalities such as text, image and video inputs.
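The constant-memory behavior is easier to see with a toy example. The sketch below is an illustrative simplification, not the exact sketching scheme from the paper: it bounds per-node memory by keeping only the top-k label weights at each node, so per-node storage does not grow with the total number of labels.

```python
import heapq

def truncate_top_k(label_weights, k=3):
    """Keep only the k highest-weight labels for a node, renormalized.

    Toy illustration of bounding per-node memory: however many distinct labels
    exist globally, each node stores at most k of them. A simplification for
    exposition, not the exact approximation used in the paper."""
    if len(label_weights) <= k:
        return dict(label_weights)
    top = heapq.nlargest(k, label_weights.items(), key=lambda kv: kv[1])
    total = sum(w for _, w in top)
    return {label: w / total for label, w in top}

# A node whose neighbors propagated weight for many different labels:
incoming = {"funny": 0.40, "yes": 0.25, "awesome": 0.20,
            "thanks": 0.10, "sad": 0.05}
print(truncate_top_k(incoming, k=3))   # only three labels are kept
```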
Language Graphs for Learning Humor
As an example use of graph-based machine learning, consider
emotion labeling
, a language understanding task in
Smart Reply for Inbox
, where the goal is to label words occurring in natural language text with their fine-grained emotion categories. A neural network model is first applied to a text corpus to learn word embeddings, i.e., a mathematical vector representation of the meaning of each word. The dense embedding vectors are then used to build a sparse graph where nodes correspond to words and edges represent semantic relationship between them. Edge strength is computed using similarity between embedding vectors — low similarity edges are ignored. We seed the graph with emotion labels known
a priori
for a few nodes (e.g., laugh is labeled as “funny”) and then apply semi-supervised learning over the graph to discover emotion categories for remaining words (e.g., ROTFL gets labeled as “funny” owing to its multi-hop semantic connection to the word “laugh”).
Learning emotion associations using graph constructed from word embedding vectors
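As a rough illustration of the graph-construction step, the following sketch builds a sparse word graph from invented embedding vectors using cosine similarity with a threshold, and seeds it with one a priori label. The vectors, vocabulary, and threshold are all made up for the example.

```python
import numpy as np

# Invented 3-d word embeddings; real embeddings are learned by a neural
# network and have hundreds of dimensions.
emb = {
    "laugh":   np.array([0.90, 0.10, 0.00]),
    "ROTFL":   np.array([0.80, 0.20, 0.10]),
    "giggle":  np.array([0.85, 0.15, 0.05]),
    "invoice": np.array([0.00, 0.10, 0.95]),
}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Build a sparse graph: keep only edges whose similarity clears a threshold.
THRESHOLD = 0.9
words = list(emb)
edges = []
for i, u in enumerate(words):
    for v in words[i + 1:]:
        sim = cosine(emb[u], emb[v])
        if sim >= THRESHOLD:
            edges.append((u, v, round(sim, 3)))

print(edges)  # "laugh", "ROTFL", "giggle" connect; "invoice" stays isolated

# Emotion labels known a priori for a few nodes; semi-supervised propagation
# (as sketched earlier) would then spread "funny" along the edges found above.
seeds = {"laugh": "funny"}
```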
For applications involving large datasets or dense representations that are observed (e.g., pixels from images) or learned using neural networks (e.g., embedding vectors), it is infeasible to compute pairwise similarity between all objects to construct edges in the graph. The Expander team solves this problem by leveraging approximate, linear-time graph construction algorithms.
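The post does not say which approximate construction algorithm is used; random-hyperplane locality-sensitive hashing is one standard way to avoid all-pairs comparisons, bucketing items by a short signature so candidate edges are only considered within a bucket. The sketch below illustrates that general idea only and should not be read as Expander’s actual method.

```python
import numpy as np

rng = np.random.default_rng(0)

def lsh_buckets(vectors, num_planes=8):
    """Group vectors by a random-hyperplane signature.

    Vectors sharing a bucket are likely (not guaranteed) to be similar, so
    candidate edges only need to be checked within buckets rather than across
    all O(n^2) pairs. A standard approximate-neighbor trick, shown here purely
    as an illustration of the general idea."""
    dim = len(next(iter(vectors.values())))
    planes = rng.standard_normal((num_planes, dim))
    buckets = {}
    for name, vec in vectors.items():
        signature = tuple(bool(s) for s in (planes @ vec) > 0)  # side of each plane
        buckets.setdefault(signature, []).append(name)
    return buckets

# Toy data: 1,000 random unit vectors standing in for learned embeddings.
vectors = {f"item_{i}": v / np.linalg.norm(v)
           for i, v in enumerate(rng.standard_normal((1000, 32)))}

for signature, members in list(lsh_buckets(vectors).items())[:3]:
    print(len(members), "items share signature",
          "".join("1" if b else "0" for b in signature))
```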
Graph-based Machine Intelligence in Action
The Expander team’s machine learning system is now being used on massive graphs (containing billions of nodes and trillions of edges) to recognize and understand concepts in natural language, images, videos, and queries, powering Google products for applications like reminders, question answering, language translation, visual object recognition, dialogue understanding, and more.
We are excited that with the recent release of Allo, millions of chat users are now experiencing smart messaging technology powered by the Expander team’s system for understanding and assisting with chat conversations in multiple languages. Also, this technology isn’t used only for large-scale models in the cloud: as announced this past week, Android Wear has opened up an on-device Smart Reply capability for developers that will provide smart replies for any messaging application. We’re excited to tackle even more challenging Internet-scale problems with Expander in the years to come.
Acknowledgements
We wish to acknowledge the hard work of all the researchers, engineers, product managers, and leaders across Google who helped make this technology a success. In particular, we would like to highlight the efforts of Allan Heydon, Andrei Broder, Andrew Tomkins, Ariel Fuxman, Bo Pang, Dana Movshovitz-Attias, Fritz Obermeyer, Krishnamurthy Viswanathan, Patrick McGregor, Peter Young, Robin Dua, Sujith Ravi and Vivek Ramavajjala.
The 280-Year-Old Algorithm Inside Google Trips
Tuesday, September 20, 2016
Posted by Bogdan Arsintescu, Software Engineer & Sreenivas Gollapudi, Kostas Kollias, Tamas Sarlos and Andrew Tomkins, Research Scientists
Algorithms Engineering is a lot of fun because algorithms do not go out of fashion: one never knows when an oldie-but-goodie might come in handy. Case in point: Yesterday, Google announced Google Trips, a new app to assist you in your travels by helping you create your own “perfect day” in a city. Surprisingly, deep inside Google Trips, there is an algorithm that was invented 280 years ago.
In 1736, Leonhard Euler authored a brief but beautiful mathematical paper regarding the town of Königsberg and its 7 bridges, shown here:
Image from Wikipedia
In the paper, Euler studied the following question: is it possible to walk through the city crossing each bridge exactly once? As it turns out, for the city of Königsberg, the answer is no. To reach this answer, Euler developed a general approach to represent any layout of landmasses and bridges in terms of what he dubbed the Geometriam Situs (the “Geometry of Place”), which we now call Graph Theory. He represented each landmass as a “node” in the graph, and each bridge as an “edge,” like this:
Image from Wikipedia
Euler noticed that if all the nodes in the graph have an even number of edges (such graphs are called “Eulerian” in his honor) then, and only then, a cycle can be found that visits every edge exactly once. Keep this in mind, as we’ll rely on this fact later in the post.
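Euler’s observation is easy to check in a few lines of code. The sketch below builds the Königsberg multigraph (the letter assigned to each landmass is just a labeling choice for this example) and tests the degree condition.

```python
from collections import Counter

# The Königsberg multigraph: four landmasses (A is the central island) and the
# seven bridges as edges.
bridges = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
           ("A", "D"), ("B", "D"), ("C", "D")]

degree = Counter()
for u, v in bridges:
    degree[u] += 1
    degree[v] += 1

# Euler: a cycle using every edge exactly once exists (in a connected graph)
# exactly when every node has even degree; even a non-returning walk needs at
# most two odd-degree nodes. Königsberg has four, so no such walk exists.
print(dict(degree))                               # {'A': 5, 'B': 3, 'C': 3, 'D': 3}
print(all(d % 2 == 0 for d in degree.values()))   # False
```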
Our team in Google Research has been fascinated by the “Geometry of Place” for some time, and we started investigating a question related to Euler’s: rather than visiting just the bridges, how can we visit as many interesting places as possible during a particular trip? We call this the “itineraries” problem. Euler didn’t study it, but it is a well known topic in Optimization, where it is often called the “Orienteering” problem.
While Euler’s problem has an efficient and exact solution, the itineraries problem is not just hard to solve, it is hard to even approximately solve! The difficulty lies in the interplay between two conflicting goals: first, we should pick great places to visit, but second, we should pick them to allow a good itinerary: not too much travel time; don’t visit places when they’re closed; don’t visit too many museums, etc. Embedded in such problems is the challenge of finding efficient routes, often referred to as the Travelling Salesman Problem (TSP).
Algorithms for Travel Itineraries
Fortunately, the real world has a property called the “triangle inequality” that says adding an extra stop to a route never makes it shorter. When the underlying geometry satisfies the triangle inequality, the TSP can be approximately solved using another algorithm discovered by Christofides in 1976. This is an important part of our solution, and builds on Euler’s paper, so we’ll give a quick four-step rundown of how it works here:
1. We start with all our destinations separate, and repeatedly connect together the closest two that aren’t yet connected. This doesn’t yet give us an itinerary, but it does connect all the destinations via a minimum spanning tree of the graph.
2. We take all the destinations that have an odd number of connections in this tree (Euler proved there must be an even number of these), and carefully pair them up.
3. Because all the destinations now have an even number of edges, we’ve created an Eulerian graph, so we create a route that crosses each edge exactly once.
4. We now have a great route, but it might visit some places more than once. No problem, we find any double visits and simply bypass them, going directly from the predecessor to the successor.
Christofides gave an elegant proof that the resulting route is always close to the shortest possible. Here’s an example of Christofides’ algorithm in action on a location graph, with nodes representing places and edge costs representing the travel time between them.
Construction of an Eulerian Tour in a location graph
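Here is a compact sketch of those four steps on an invented five-place travel-time matrix. It follows the structure described above but substitutes a greedy pairing of odd-degree places where Christofides uses a minimum-weight perfect matching, so it illustrates the shape of the algorithm rather than reproducing its approximation guarantee.

```python
import itertools
from collections import defaultdict

# Toy symmetric travel times (minutes) between five invented places.
places = ["museum", "palace", "bridge", "market", "park"]
time_between = {
    ("museum", "palace"): 10, ("museum", "bridge"): 15, ("museum", "market"): 20,
    ("museum", "park"): 25, ("palace", "bridge"): 9, ("palace", "market"): 18,
    ("palace", "park"): 22, ("bridge", "market"): 8, ("bridge", "park"): 14,
    ("market", "park"): 7,
}
def cost(a, b):
    return time_between[(a, b)] if (a, b) in time_between else time_between[(b, a)]

# Step 1: build a minimum spanning tree (Prim's algorithm).
in_tree, mst = {places[0]}, []
while len(in_tree) < len(places):
    a, b = min(((u, v) for u in in_tree for v in places if v not in in_tree),
               key=lambda e: cost(*e))
    mst.append((a, b))
    in_tree.add(b)

# Step 2: pair up the odd-degree places. (Christofides uses a minimum-weight
# perfect matching here; a greedy pairing keeps the sketch short.)
degree = defaultdict(int)
for a, b in mst:
    degree[a] += 1
    degree[b] += 1
odd = [p for p in places if degree[p] % 2 == 1]
matching = []
while odd:
    a = odd.pop()
    b = min(odd, key=lambda x: cost(a, x))
    odd.remove(b)
    matching.append((a, b))

# Step 3: every place now has even degree, so the combined multigraph has an
# Eulerian tour; find one with Hierholzer's algorithm.
adj = defaultdict(list)
for a, b in mst + matching:
    adj[a].append(b)
    adj[b].append(a)
stack, tour = [places[0]], []
while stack:
    v = stack[-1]
    if adj[v]:
        u = adj[v].pop()
        adj[u].remove(v)
        stack.append(u)
    else:
        tour.append(stack.pop())

# Step 4: shortcut repeated visits to get the final route.
seen, route = set(), []
for p in tour:
    if p not in seen:
        seen.add(p)
        route.append(p)
route.append(route[0])
print("route:", route,
      "| total minutes:", sum(cost(a, b) for a, b in zip(route, route[1:])))
```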
Armed with this efficient route-finding subroutine, we can now start building itineraries one step at a time. At each step, we estimate the benefit to the user of each possible new place to visit, and likewise estimate the cost using the Christofides algorithm. A user’s benefit can be derived from a host of natural factors such as the popularity of the place and how different the place is relative to places already visited on the tour. We then pick whichever new place has the best benefit per unit of extra cost (e.g., time needed to include the new place in the tour). Here’s an example of our algorithm actually building a route in London using the location graph shown above:
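Separately from the London example, here is a minimal sketch of the greedy selection loop described above: repeatedly add the candidate place with the best estimated benefit per minute of added travel, subject to a time budget. The places, benefit scores, travel times, and budget are invented, and a brute-force round-trip cost stands in for the Christofides subroutine.

```python
import itertools

# Invented benefit scores and symmetric travel times (minutes). Real benefits
# would reflect popularity and how different a place is from ones already chosen.
benefit = {"tower": 9.0, "gallery": 7.0, "cathedral": 6.0, "aquarium": 4.0}
travel = {("hotel", "tower"): 20, ("hotel", "gallery"): 10,
          ("hotel", "cathedral"): 15, ("hotel", "aquarium"): 30,
          ("tower", "gallery"): 12, ("tower", "cathedral"): 8,
          ("tower", "aquarium"): 25, ("gallery", "cathedral"): 9,
          ("gallery", "aquarium"): 22, ("cathedral", "aquarium"): 18}

def minutes(a, b):
    return travel[(a, b)] if (a, b) in travel else travel[(b, a)]

def route_cost(stops):
    """Cheapest round trip from the hotel through the chosen stops.

    Brute force is fine at this toy size; the post uses the Christofides
    subroutine to estimate route cost for larger sets of places."""
    if not stops:
        return 0
    return min(sum(minutes(a, b) for a, b in zip(("hotel",) + p, p + ("hotel",)))
               for p in itertools.permutations(stops))

TIME_BUDGET = 70   # total minutes of travel we are willing to spend
chosen = []
while True:
    current = route_cost(tuple(chosen))
    best_score, best_place = 0.0, None
    for place in benefit:
        if place in chosen:
            continue
        extra = route_cost(tuple(chosen) + (place,)) - current
        if current + extra > TIME_BUDGET:
            continue                                  # would blow the budget
        score = benefit[place] / max(extra, 1e-9)     # benefit per added minute
        if score > best_score:
            best_score, best_place = score, place
    if best_place is None:
        break
    chosen.append(best_place)

print("itinerary:", chosen, "| travel minutes:", route_cost(tuple(chosen)))
```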
Itineraries in Google Trips
With our first good approximate solution to the itineraries problem in hand, we started working with our colleagues from the Google Trips team, and we realized we’d barely scratched the surface. For instance, even if we produce the absolute perfect itinerary, any particular user of the system will very reasonably say, “That’s great, but all my friends say I also need to visit this other place. Plus, I’m only around for the morning, and I don’t want to miss this place you listed in the afternoon. And I’ve already seen Big Ben twice.” So rather than just producing an itinerary once and calling it a perfect day, we needed a fast dynamic algorithm for itineraries that users can modify on the fly to suit their individual taste. And because many people have bad data connections while traveling, the solution had to be efficient enough to run disconnected on a phone.
Better Itineraries Through the Wisdom of Crowds
While the algorithmic aspects of the problem were highly challenging, we realized that producing high-quality itineraries was just as dependent on our understanding of the many possible stopping points on the itinerary. We had Google’s extensive travel database to identify the interesting places to visit, and we also had great data from Google’s existing systems about how to travel from any place to any other. But we didn’t have a good sense for how people typically move through this geometry of places.
For this, we turned to the wisdom of crowds. This type of wisdom is used by Google to estimate delays on highways, and to discover when restaurants are most busy. Here, we use the same techniques to learn about common visit sequences that we can stitch together into itineraries that feel good to our users. We combine Google's knowledge of when places are popular with the directions between those places to gather an idea of what tourists like to do when travelling.
And the crowd has a lot more wisdom to offer in the future. For example, we noticed that visits to Buckingham Palace spike around 11:30 and stay a bit longer than at other times of the day. This seemed a little strange to us, but when we looked more closely, it turns out to be the time of the Changing of the Guard. We’re looking now at ways to incorporate this type of timing information into the itinerary selection algorithms.
So give it a try: Google Trips, available now on Android and iOS, has you covered from departure to return.