How GPU Database Technologies Are Shaping the Future of Data and Analytics
Kinetica recently held a webinar to share how a GPU database stands to help enterprises capture, analyze, and act on data in real time. Jim Curtis, Senior Analyst for 451 Research, and Kinetica’s Manan Goel, VP of Products, detailed how GPUs are enabling new commercial and consumer applications in databases, artificial intelligence, and real-time analytics.
A summary of key industry observations made by Jim Curtis is included below:
Before going over the growth and opportunity in the analytic database market, let’s review how analytics fits among the five major types of database technology:
- Operational (Relational, NoSQL, NewSQL): Used to support and drive operational business applications, storing records of customers, products, and suppliers, and keeping track of daily business transactions.
- Analytic (Analytical, Data Warehouse, GPU, Hadoop/Spark): Used to support analytical functions and activities, often alongside transactional systems. Includes the Hadoop storage and processing framework as well as the Spark processing framework.
- Search-based: Highly complex; used to better structure and order corporate data so that it can be searched by others.
- Event/Stream Processing: Used to rapidly analyze things that happen (events) and to drive actions on those events.
- Data Grid/Cache: Creates an in-memory layer between the application and the database for fast querying.
GPU databases and GPU-accelerated platforms fit in the analytical database segment. Analytical databases are best described as those that support analytical functions and that often sit beside transactional systems as part of an integrated whole. Hadoop, Spark, and other processing frameworks can also be included in this segment, as they contribute to the overall analytical strategy.
Growth and opportunity in the analytic database market
The total analytic data platform market is expected to grow at about 12% CAGR over a six-year period, and the analytical database market is growing collectively at about 10%. What is interesting is that some of the real growth and activity is occurring at the emerging end of this market; that’s because databases and the SQL model have been around for the better part of 40 years, and there are new innovations which address some of the changes that companies are experiencing.
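To put those growth rates in perspective, the short sketch below compounds an annual rate over a number of years. It is only an arithmetic illustration: the function name `compound_growth` is hypothetical, and applying the same six-year period to the 10% figure is an assumption, since the period is stated only for the 12% figure.

```python
def compound_growth(rate: float, years: int) -> float:
    """Return the cumulative growth multiple after compounding `rate` annually."""
    return (1.0 + rate) ** years


# Rates quoted in the text; the six-year window for the 10% figure is assumed.
platform_multiple = compound_growth(0.12, 6)
database_multiple = compound_growth(0.10, 6)

print(f"12% CAGR over 6 years: {platform_multiple:.2f}x the starting size")
print(f"10% CAGR over 6 years: {database_multiple:.2f}x the starting size")
```

In other words, a market compounding at about 12% a year roughly doubles over six years, while one compounding at about 10% grows by a little over three quarters.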
Major changes in the database market
A lot of the changes in the database market are based on market requirements, customer needs, data, and technology trends:
- Need more than a singular system – need multiple systems
- Collection of and analysis on otherwise unused data
- Moving from proprietary hardware to commodity hardware
- Shifting environment: on-premises, cloud, and hybrid
- Leveraging hardware acceleration for analysis, think GPUs
You can no longer buy a single database and expect it to address all of your analytical needs and the processes that are driven from them. Now we’re seeing multiple systems integrating together. At 451, we use the term total data warehouse (also called a virtual data warehouse) for an environment where several systems function together. You can capture and collect pretty much any kind of data that you want, persist it in some way, and then feed it to another system.
There is also a trend of moving away from proprietary hardware to commodity hardware, which offers a couple of benefits: first, it reduces costs; second, it makes scaling easier, which is important because as data grows, systems need to accommodate that growth. Environments are also shifting, with options including on-premises, cloud, and hybrid. Often users don’t really care where these things live, but the environments are becoming blended, and databases, persistence, and analytics now run in all of them.
We’re also seeing innovation with hardware acceleration and GPUs. Software still needs hardware to process it, and hardware can provide performance advantages with various workloads.
GPUs: What’s the difference from CPUs?
CPUs and GPUs have fundamentally different architectures: CPUs consist of multiple cores, while GPUs consist of thousands of cores; CPUs are geared for serial operations whereas GPUs are geared for parallel operations. Both are paired together within an environment for the greatest overall optimization.
There are at least two fundamental differences between these technologies. They are in fact different architectures, but we also see that although they are different, they are often paired together in an environment. When they are paired together, you often get the greatest overall optimization, because you can leverage each of them to do the things that they’re good at, and that gives you a great deal of benefit. Generally speaking, CPUs are very good at serial operations – operations that would occur in some kind of a sequence. GPUs, on the other hand, will do parallel operations. Obviously, there are great performance advantages in that. Not every activity can be parallelized, but for those that can, there are great benefits.
CPUs often consist of multiple cores, but GPUs have thousands of cores. Again, this is how you get parallelization and orders of magnitude increases in performance.
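The caveat that not every activity can be parallelized is commonly quantified with Amdahl’s law, which the webinar summary does not mention by name. The sketch below is an illustration under simplifying assumptions: it treats every core as an equally capable worker (in practice, individual GPU cores are simpler than CPU cores) and ignores data-transfer overhead. The function name `amdahl_speedup` is hypothetical.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Theoretical speedup when `parallel_fraction` of the work is spread over `workers`."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part takes the same time regardless of how many workers exist.
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Compare workloads with different parallelizable shares on 1,000 workers.
for fraction in (0.50, 0.90, 0.99):
    speedup = amdahl_speedup(fraction, 1000)
    print(f"{fraction:.0%} parallelizable -> {speedup:.1f}x speedup")
```

The serial share of a workload caps the achievable speedup no matter how many cores are added, which is why pairing CPUs (for the serial part) with GPUs (for the parallel part) tends to give the best overall result.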
What’s required for analytics?
GPUs provide a great deal of benefit for analytics. If you’re doing analytics at a high level, what sort of things do you need? You obviously need data, but you also need to have methods for sorting or operating on that data, whether it’s a querying tool, algorithms, or tools to access and perform operations on that data. You then need processing capability to be able to process those methods.
Today, things look like the above diagram, where SQL is often the query tool of choice. Data might be used to feed a BI workload or a dashboard of some sort; this is a very common environment today.
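As a rough illustration of that common pattern (data, a SQL query as the method, and an engine to process it), the sketch below aggregates a tiny table of the kind a dashboard might display. It uses Python’s built-in `sqlite3` module, which runs on the CPU and is not a GPU database; the `sales` table, its columns, and the function name `revenue_by_region` are all made up for this example.

```python
import sqlite3


def revenue_by_region(rows):
    """Load (region, amount) rows into an in-memory table and total them per region."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        # The kind of aggregate query a BI dashboard would typically issue.
        cursor = conn.execute(
            "SELECT region, SUM(amount) AS total "
            "FROM sales GROUP BY region ORDER BY region"
        )
        return cursor.fetchall()
    finally:
        conn.close()


sample_rows = [("east", 120.0), ("west", 80.0), ("east", 30.0)]
for region, total in revenue_by_region(sample_rows):
    print(f"{region}: {total:.2f}")
```

A per-group aggregate like this is a natural candidate for parallel execution, since each group’s rows can be summed independently of the others.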
How do GPUs extend the analytical benefits? The main benefit is that you get performance acceleration; that parallel processing gives you a great deal of advantage. Speed matters in a lot of things: time to market matters, and responding to customers matters. All of these things matter when performance is at play. The other benefit is that you can deal with large data sets; your processing power allows you to deal with a much bigger piece of data.
It also turns out that workloads like deep learning are a great fit for GPUs, because much of the complex math involved can be parallelized and run concurrently to build models. As a result, deep learning is a big consumer of GPU technology, and it relies on that technology to perform at scale.
GPUs make real-time query response a reality. If you can query and process your data quickly, you can feed it to reports and dashboards in real time. We’re also seeing benefits when GPUs are paired with visualization tools; you get much greater benefit than you would with other visualization tools. That’s because when a visualization tool is optimized for GPUs, you can drill down into all of the data. These are just some of the basic benefits that we see in a GPU database.
Where is this all leading?
You may be familiar with the term “digital transformation.” Digital transformation is basically when organizations invest in new technologies and processes to engage customers, partners, and employees more effectively. That also leads to operational costs and operational benefits.
Our research shows that it’s still early days with this, but in our survey, 40% of respondents are either just starting or in the midst of their strategy. That leaves another 60% that are either ignoring it or are at various levels along the adoption cycle. Data is driving a lot of these changes, along with the need for analytics and the need to respond in certain ways. All of these are causing companies to look to certain technologies and adopt certain processes to remain competitive in this market.
The Road Ahead
Reaping the benefits of adopting these technologies as part of digital transformation is not going to happen overnight; it will be an evolving development, but the benefits can be quite significant for companies. GPUs can transform how organizations operate, and they can transform how these companies serve customers. GPUs may disrupt businesses and markets, and new businesses or business models might evolve to capitalize on competitive advantage and lower costs. This is a trend that most companies are looking at.
Key takeaways
There are a couple of key takeaways. At 451 Research, we see a tremendous amount of growth in the analytical database and platform market, and we particularly see that on the emerging side.
Analytics will still be key, and people are going to need to turn to the GPU database to be able to manage that. Analytics, and particularly advanced analytics, are a very good fit for GPUs because of their parallel nature; you can run workloads on GPUs that would otherwise take significantly longer if you were using CPUs alone. Analytics will play a big part with organizations going forward, and this new technology can support that.
Digital transformation is a trend as we mentioned, and since these technologies are a part of that trend, customers are now able to use these technologies to better adapt to the changing conditions around them.