Statisticians have used the open source R language for data analysis, predictive modeling and visualization for over a decade. This week, R is set to undergo a revolution of sorts, with a revamped commercial effort that aims to advance the language’s adoption.
Commercial R vendor Revolution Computing has been in the market for the past two years and is now rebranding itself as Revolution Analytics. The company is also rolling out a new roadmap for its R tools that it hopes will expand the market.
“We’ve seen R really start to spill over into the commercial world in recent years,” Jeff Erhardt, chief operating officer at Revolution Analytics, told InternetNews.com. “The market for R is anywhere there is a company that is collecting a lot of data and is looking to get an edge by doing analysis of that data or forecasting. We get involved with investment banks and hedge funds and we also see it used in the pharmaceutical industry.”
Revolution Analytics builds on top of the open source R language with its Revolution R Enterprise product, which provides additional tools and capabilities to R users. Erhardt noted that the Enterprise product scales beyond what is available in the core open source project.
“We make R run fast on modern hardware,” Erhardt said. “The open source project in general only makes use of one CPU or core, while our version makes use of all CPUs and cores that are present in hardware.”
David Smith, vice president of community at Revolution Analytics, explained to InternetNews.com that R, as currently set up, consists of the core language plus the packages and applications built on top of it. In his view, it is the combination of the core R programming language and the additional packages that makes up what people think of when they consider R.
One of the new packages that Revolution Analytics is working on aims to further expand the data scalability of R.
“Today, R is memory-bound and cannot handle large data sets,” Smith said. “So we’ll be bringing out in the near future a solution for handling massive datasets in the terabyte and petabyte range.”
Erhardt added that Revolution Analytics is agnostic about how it interfaces with data sources.
“In terms of the internal implementation, we’ve developed an intermediate file format that is based on the NoSQL model,” Erhardt said. “But it is neither column- nor row-based and has been designed with the needs of statistical algorithms in mind.”
Revolution Analytics is also working on R usability. Smith noted that R has, to date, been a command-line-driven language, which has been a challenge for some mainstream enterprise users.
“So we’ll be rolling out a modern, flexible and extensible thin client user interface that can bridge the gap between expert users and the more casual business analysts,” Smith said.