Google Research Blog
The latest news from Research at Google
Google at SIGMOD/PODS 2012
Friday, July 13, 2012
Posted by Anish Das Sarma, Research Scientist, and Jeff Shute, Software Engineer
Over the years, SIGMOD has expanded beyond a traditional "database" conference to include several areas related to information management. This year’s ACM SIGMOD/PODS conference (on Management of Data, and Principles of Database Systems), held in Scottsdale, Arizona, was no different. We were impressed by the wide variety of researchers from industry and academia that the conference attracted, and we enjoyed learning how others are pushing the limits of scalability in data storage and processing. In addition to an excellent set of papers on a large number of topics, we saw a couple of recurring themes:
1) Data Visualization
Pat Hanrahan from Stanford gave a keynote on some of the challenges involved in building systems to enable "data enthusiasts" to manage and visualize data.
Google’s Fusion Tables group also had a paper on this topic: Efficient Spatial Sampling of Large Geographical Tables, by Anish Das Sarma, Hongrae Lee, Hector Gonzalez, Jayant Madhavan, Alon Halevy. (This paper has been invited to a TODS special issue on best papers of SIGMOD 2012.)
A similar effort from the University of Washington was presented as a demo: VizDeck: Self-Organizing Dashboards for Visual Analytics, by Alicia Key, Bill Howe, Daniel Perry, Cecilia Aragon.
2) Big Data
As has been the case for the last couple of years, “Big Data” has been of ever-growing interest to the entire community, particularly in industry. Google presented a talk on F1, a new distributed database system we’ve built to power the AdWords system. A complex business application like AdWords has different requirements than many systems at Google that often use storage systems like Bigtable. We have a single database shared by hundreds of developers and systems, so we need the robustness and ease of use we’re used to from traditional databases. F1 is built to scale like Bigtable, without giving up the database features we also need, like strong consistency, ACID transactions, schema enforcement, and most importantly, SQL queries.
There’s been a widespread trend over the last several years away from databases, towards highly scalable “NoSQL” systems. We don’t think that trade-off is necessary, and were happy to see several other speakers advocate a similar theme -- yes, databases are useful, and developers shouldn’t need to give up database features and ease of use in the name of scalability.
This theme was supported by an industry session on Big Data featuring talks from other companies: Facebook (TAO: How Facebook Serves the Social Graph), Twitter (Large-Scale Machine Learning at Twitter), and Microsoft (Recurring Job Optimization in Scope). Googler Kristen LeFevre was a panelist on the "Perspectives on Big Data" panel organized by Surajit Chaudhuri from Microsoft, which also featured Donald Kossmann from ETHZ, Sam Madden from MIT, and Anand Rajaraman from Walmart Labs. Last but not least, Surajit Chaudhuri also gave an excellent keynote outlining some of the research challenges that the new era of "Big Data and Cloud" poses.
As has been the practice for several years now, SIGMOD organized events to keep generating interest in data management research, such as this year's "New Researcher Symposium" (which included Anish Das Sarma from Google as a panelist).
In addition to Google sponsoring the conference, many Googlers attended, contributing to a robust presence and affording us the opportunity to interact with the broader information management community. We've been pushing the frontiers of science with cutting-edge research in many aspects of data management, and we were eager to share our innovations and see what others have been working on. We found Amin Vahdat's keynote on the intersection of Networking and Databases to be a highlight of Google’s participation, which also included presenting papers, participating on panels, and taking part in planning and program committees:
Program Committee Members
Anish Das Sarma, Venkatesh Ganti, Zoltan Gyongyi, Alon Halevy (Tutorials Chair), Kristen LeFevre, Cong Yu
Talks
Symbiosis in Scale Out Networking and Data Management
Amin Vahdat, Google (Keynote)
F1-The Fault-Tolerant Distributed RDBMS Supporting Google's Ad Business
Jeff Shute, Mircea Oancea, Stephan Ellner, Ben Handy, Eric Rollins, Bart Samwel, Radek Vingralek, Chad Whipkey, Xin Chen, Beat Jegerlehner, Kyle Littlefield, Phoenix Tong (Googlers)
Finding Related Tables
Anish Das Sarma, Lujun Fang, Nitin Gupta, Alon Halevy, Hongrae Lee, Fei Wu, Reynold Xin, Cong Yu (Googlers)
Papers
CloudRAMSort: Fast and Efficient Large-Scale Distributed RAM Sort on Shared-Nothing Cluster
Changkyu Kim, Jongsoo Park, Nadathur Satish, Hongrae Lee (Google), Pradeep Dubey, Jatin Chhugani
Efficient Spatial Sampling of Large Geographical Tables
Anish Das Sarma, Hongrae Lee, Hector Gonzalez, Jayant Madhavan, Alon Halevy (Googlers)
Panels
Perspectives on Big Data Plenary Session: Privacy and Big Data
Kristen LeFevre, Google
SIGMOD New Researcher Symposium - How to be a good advisor/advisee?
Anish Das Sarma, Google
Overall, this year’s SIGMOD was a great conference, widely attended by researchers from industry and academia, with a very interesting mix of research presentations and discussions. Google had a good showing at the conference, and we look forward to continuing this trend in the coming years.