Part V of a lightly edited partial transcript of a panel discussion titled “Making SQL Great Again (SQL is Huuuuuge)” at YesSQL Summit 2016, organized by the Northern California Oracle Users Group (NoCOUG) at Oracle Corporation’s headquarters in Redwood City, California. NoCOUG is the longest-running and most active Oracle users group in the world. An individual membership costs only $95 and entitles the member to free admission to the four consecutive quarterly NoCOUG conferences (one-day events) that follow the membership’s start date, the winter conference being the first day of YesSQL Summit. You can become a member at http://nocoug.org/join.html.
The panelists were Andrew (Andy) Mendelsohn (Executive Vice-President, Database Server Technologies, Oracle), Graham Wood (Architect, Oracle), Bryn Llewellyn (Distinguished Product Manager, Oracle), Hermann Baer (Senior Director, Product Manager, Oracle), and Steven Feuerstein (Architect, Oracle). The moderator was Kyle Hailey, an Oracle ACE Director and member of the OakTable Network. The complete video of the panel discussion has been published by Oracle Corporation on the Oracle Channel on YouTube.
[If SQL is the best language for Big Data, what explains the rise of Hadoop?]
Hermann Baer: Andy alluded to this a little bit before. The short answer to “why Hadoop?” would be that it’s cheap and people think it’s cool. Although, as Andy said, the coolness is starting to fade a little bit, and whether it was ever as cool as Oracle red shoes is a different story. But now, a little more seriously. The question itself isn’t very clearly defined, because it raises further questions: “what is Hadoop?”, “what is SQL?”, and “what are these things that we are comparing with each other?” As Andy said before, Hadoop has multiple parts, some dealing with the persistence of data and others dealing with the processing of data, and the same is actually true of the notion of SQL in that question.
When this question comes up, people probably assume automatically that when I talk about SQL, I am talking about fixed-structure data that persists somewhere, somehow, on a physical device, and this is actually not true. When we talk about SQL, or at least how we see the world here … we talk about the processing of data, where it’s not necessarily relevant where the data is stored. So, talking about SQL as a development language or a data manipulation and data access language, it doesn’t matter whether the data resides in fixed structures, or whether the data is defined in a looser context (whether this is a JSON document or anything else), or whether the data is even stored outside the database, and I think that is where Hadoop comes into the picture looking forward.
As Andy said, HDFS is definitely a capable, cool, distributed filesystem, and it’s cheap, undoubtedly. Someone has to take care of it, someone has to manage the data. It definitely makes life easier to put data into any kind of persistence structure where there’s no structure to begin with; that is similar to NoSQL. You just put it in, and that makes it easier to get data in, but at some point in time the rubber hits the road and people want to do something with the data, and that is where some kind of structure comes into play as well. This is where we actually want to combine these two worlds, and we put a lot of effort into the development of SQL as the data processing language, to make it easy and efficient to work on any kind of structure.
We actually started back in the 9i days with the introduction of external tables, which was the very first step toward making Oracle a data-processing engine, and this is what we have been developing towards ever since. We have made big inroads here with the combination of data stored inside the database and data stored outside the database with Big Data SQL, where we’re trying to optimize the processing for multiple different shades of data containers to work as efficiently and as scalably as possible with the data. So I think that is where we have to say clearly, when we look at what we are doing here from the SQL perspective, that we’re dealing first and foremost with the data processing, and then we’re optimizing whatever lies behind it in the data persistence layer to make this as unified, as global, and as effective as possible.
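[Editor’s sketch, not part of the panel: the idea of one SQL query spanning data stored inside and outside the database can be illustrated roughly as follows. This is not Oracle’s external table or Big Data SQL implementation. It assumes Python’s built-in sqlite3 module as a stand-in SQL engine, and the table names (customers, orders_ext), the CSV contents, and the function name are all hypothetical. Unlike a real external table, which reads the file when the query runs, this sketch copies the outside rows into a temporary table first.]

```python
import csv
import io
import sqlite3

# Hypothetical data living "outside the database", e.g. a CSV file on a
# distributed filesystem. Inlined here so the sketch is self-contained.
EXTERNAL_CSV = """customer_id,amount
1,20.0
2,5.5
1,4.5
"""


def query_inside_and_outside():
    """Join a regular table with rows that originate outside the database."""
    conn = sqlite3.connect(":memory:")

    # Data stored inside the database.
    conn.execute(
        "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)", [(1, "Ada"), (2, "Grace")]
    )

    # Stand-in for an external table: structure is imposed on the raw text
    # only at this point, when we want to process it with SQL.
    conn.execute(
        "CREATE TEMP TABLE orders_ext (customer_id INTEGER, amount REAL)"
    )
    reader = csv.DictReader(io.StringIO(EXTERNAL_CSV))
    rows = [(int(r["customer_id"]), float(r["amount"])) for r in reader]
    conn.executemany("INSERT INTO orders_ext VALUES (?, ?)", rows)

    # The query itself does not care where each table's data came from.
    result = conn.execute(
        """
        SELECT c.name, SUM(o.amount) AS total
        FROM customers c
        JOIN orders_ext o ON o.customer_id = c.customer_id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for name, total in query_inside_and_outside():
        print(name, total)
```

[The point of the sketch is only that the SELECT statement is written the same way regardless of where the rows originated; how efficiently the engine reaches the outside data is a separate concern handled below the SQL layer.]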
Panelists introduce themselves and tell their stories.
Why are we even having this discussion? Why is it necessary to defend SQL? Are NoSQL and Hadoop temporary phenomena that will eventually fade away just like object-oriented database management systems?
The NoSQL folks claim that NoSQL is “web scale”. Are relational database management systems “web scale”? How does PL/SQL fit into the performance picture? Is PL/SQL “web scale”?
Why does Oracle Corporation sell a NoSQL DBMS?
What is the Oracle Developer Advocates team doing to defend RDBMS?
What is Oracle doing to fend off NoSQL and Hadoop?
Copyright © 2016 Iggy Fernandez