Thursday, June 30, 2016

SQL - R Services in SQL Server 2016

What is R?
R is the most popular open sourced software language which is used for advanced analytics analysis, uncover patterns, identify trends or even build predictive models. It has a large and vibrant community and thousands of packages covering multiple domains and algorithms. It plays a tremendous role for interactive data analysis, data visualisation and predictive modeling.
Challenges with R Even though R has the great analysing features, business organisations seeking to accept R in production are hammering numerous significant challenges:
  • Scale and performance: While R is open-source and free, R packages run single-threaded and can consequently only consume datasets that fit into available memory on the local machine. This memory constraint becomes a critical drawback as data sizes grow or when wishing to speed up execution.
  • Integrating with applications: R code can be easily integrated with business applications in production but many enterprises end-up translating the R code into different languages in order to integrate it. This process is typically cumbersome, costly and error-prone and doesn’t support agile innovation. Enterprises need a simple way for applications to retrieve predictions, scores and plots, schedule R jobs and integrate it with existing application logic.
  • Data movement: Moving data out of the database is best avoided when possible. Having to transfer data for analysis on a separate machine running R can present security headaches and latency issues when data volumes grow. It also tends to create data fragmentation because different people in the organization find themselves working on different versions of the data.
While R is open-source and free, addressing the challenges above introduces time and cost considerations. Innovation and agility suffer if you’re limited in the volumes of data you can process in production, if processing data takes way too long, or if you need to set up workflows to move data around. Similarly, costs can rise significantly if you need a development team to re-code the R scripts or spend extra money on special hardware.

What is SQL Server R Services?
R Services is the latest feature of SQL Server 2016 which is fully loaded with accelerating innovations such as allowing building the smart applications with R, deploying intelligent applications that uncover new insight which are capable to retrieve predictions and visuals using Transact-SQL.

Integration R with SQL Server means brings together the best of the two worlds where R Services support in-database analytics allowing you to keep the data in SQL Server 2016 and have applications call R scripts via a T-SQL system stored procedure. This makes application integration with R easier than ever. R scripts execute in a secure fashion on the SQL Server box so you are protected from R code trying to reach over to the Internet or to other jobs running on the SQL Server.

Apart from this; we have to facility to work on the large datasets which provides the multiple benefits-
  1. It brings the R applications to the production through the most familiar T-SQL interface. This is because most of the developers are aware about the T-SQL and they can easily integrate R with their T-SQL code and use it through the SQL Server.
  2. It avoids the need of data movement between our data and R computation and R computation will be running on the same server as the database in DB analytics. It facilitates to work on the same copy of the data in the data store.
  3. It provides a superior performance and scale for common operations, over open source R while allowing the use of any R package. It allows scaling for large datasets, far exceeding the available memory on the machine.
There is the facility to install SQL Server R Services where we can embed R code in a new system stored procedure – sp_execute_external_script.
Important- R Services builds on a new extensibility framework in SQL Server that allows for secure execution of external runtimes. This security mechanism is handled by special system stored procedure which starts new process that executes the R code and exchange the data with the SQL Engine.

R Services supports resource governance similar to other SQL Server features means we to facility to cap the CPU and memory allocated to R runtime. 

Conclusion
R Services support in-database advance analytics allowing users to keep the data in SQL Server 2016 and customers can benefit from the included ScaleR algorithm library, a set of functions that provide equivalents for routine tasks performed in R. Data Scientists have multiple tools for R such as R IDE, R Studio or R Tools for Visual Studio. To control resources and to assign permissions, Server Administrators can use the familiar SQL Server Management Studio to manage R Services.
References: Microsoft

2 comments:

  1. Please go through my profile in linkedin.

    Please let me know what are the prerequisite skills.
    Kindly mail me at vidyapeeta@gmail.com

    ReplyDelete
  2. Mukesh,

    I am interested in learning R to get my skills upgraded in IIoT. Please go through my profile in linkedin. Suggest me with your valuable advice. I cannot spend in $$$$ or lakhs of rupees Please

    ReplyDelete