This article by Gordon Haff provides some interesting insights into Intel's thinking around software parallelization. At Pervasive, we often discuss the trade-off of building a framework like DataRush in Java rather than in a "lower level" programming language (i.e., one closer to the metal). I think this passage sums it up nicely:
Reinders began by noting that developers fall into roughly two groups when it comes to parallel programming: those who are still concerned about ultimate performance even in a parallel world and those who are just looking for a way to deal with it at all.
Reinders goes on to say:
And we need to be a little more relaxed about the performance. The people who start asking me about efficiency in every last cycle used and such--I characterize them as people we need to talk to more about our high-performance computing-oriented tools that give you full control. And other people are "I don't even know how to approach parallelism." I think there is a different set of ways to talk about the problem.
The High Performance Computing (HPC) world falls into that first camp. Its developers will always look to squeeze every last ounce of performance from a machine. That can come at the cost of ease of programming, debugging, and profiling. The HPC community is also relatively small compared to the huge number of developers who use languages like Java and are not MPI/OpenMP experts.
Pervasive DataRush is targeted at the business developer with large sets of data to profile, process, clean, analyze, and mine. They need to focus on their target problem and not waste time worrying about threading, synchronization, race conditions and so on. We need a programming paradigm change. DataRush, using a dataflow architecture, is moving in that direction in the domain of data-intensive (data parallel) applications.
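To make the "focus on the problem, not the threads" idea concrete, here is a minimal sketch in plain Java. It does not use the DataRush API and it is not a dataflow graph. As a stand-in, it assumes the JDK's parallel streams (Java 8 or later), and the class name, method name, and sample records are all hypothetical. The point is only that the developer describes the profiling task (count records per category) and leaves thread management to the runtime.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical example class; not part of DataRush.
public class ParallelProfileSketch {

    // Counts how often each value occurs in the input.
    // parallelStream() lets the JDK split the work across cores;
    // the collector merges the partial results, so this code
    // contains no explicit threads, locks, or shared mutable state.
    static Map<String, Long> countByCategory(List<String> records) {
        return records.parallelStream()
                .collect(Collectors.groupingBy(Function.identity(),
                                               Collectors.counting()));
    }

    public static void main(String[] args) {
        // Made-up sample data for illustration only.
        List<String> records =
                Arrays.asList("gold", "silver", "gold", "bronze", "gold");
        System.out.println(countByCategory(records));
    }
}
```

For a list this small the parallel version will not be faster than a sequential loop. The benefit only shows up on large inputs, which is the data-intensive case discussed here.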
But aren't we leaving performance behind by using Java? Could we squeeze out another 5% by using C or C++? Maybe, maybe not: that's an argument for another day. Either way, I'm willing to make that concession. Another way of looking at it is this: what about all of the performance gains we are enabling for developers who would not otherwise achieve them? By making parallel programming, especially data parallelism, easier and more accessible, we can affect a large part of the business computing market. Data analysis and data mining are areas experiencing heavy growth today. DataRush makes it easy to build parallel analytics, enabling big performance gains without a huge development cost. Not a bad trade-off.