
Introduction

Over the past decade, momentum from both academia and industry has propelled the R programming language, making it one of the most important tools for computational statistics, visualization, and data science.

Due to R's growth in the data science community, there is a constant need to upgrade and develop both R and RStudio. The RStudio conference is a platform where practitioners and scholars from around the globe come together to share their knowledge and developments.

What is the RStudio Conference?

rstudio::conf 2019 was all about R and RStudio. Hundreds of new and advanced R users came together in Austin, Texas from Tuesday, January 15 through Friday, January 18 to become better at data science with R.

The conference happens every year and showcases the latest trends and developments, along with goals for the year ahead.

Agenda

This year's conference was split into two parts: talks and workshops. The talks featured speakers sharing their experiences and the advancements they had made in R over the previous year, while the workshops offered hands-on experience with topics such as big data in R and getting Shiny apps to production level.

The main theme of this year's conference was R in production. Most of the talks and workshops focused on the production aspects of R applications, whether with Shiny, sparklyr, or the various packages that rolled out at the conference.

Talks

1. Shiny in production: Principles, practices, and tools

Shiny is a web framework for R, a language not traditionally known for web frameworks, to say the least. As such, Shiny has always faced questions about whether it can or should be used “in production”. In this talk, they explored what “production” even means, reviewed some of the historical obstacles and objections to using Shiny for production purposes, and discussed practices and tools that can help your Shiny apps flourish.
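
For context, a minimal Shiny app looks like this (a generic sketch, not code from the talk):

```r
library(shiny)

# a tiny app: one input, one reactive plot
ui <- fluidPage(
  sliderInput("bins", "Number of bins", min = 5, max = 50, value = 20),
  plotOutput("hist")
)

server <- function(input, output, session) {
  output$hist <- renderPlot({
    hist(faithful$eruptions, breaks = input$bins)
  })
}

shinyApp(ui, server)
```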

2. A guide to modern reproducible data science with R

The session focused on creating custom computing environments that can be shared instantly with remote users, packaging small to medium data inside and outside packages, and creating simple to complex workflows to track the provenance of your results. This addresses problems developers have long faced when reproducing work in R, such as package dependencies and mismatched versions.
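
As one illustration of this kind of dependency management, here is a minimal sketch using the renv package; renv is a common choice today, though the talk's exact tooling may have differed:

```r
# install.packages("renv")
library(renv)

renv::init()      # create a project-local library and a lockfile
# ... install and use packages as usual ...
renv::snapshot()  # record exact package versions in renv.lock

# a collaborator (or future you) recreates the same environment with:
renv::restore()
```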

3. R in Production

With the increase in people using R for data science comes an associated increase in the number of people and organisations wanting to put models or other analytic code into "production". We often hear it said that R isn't suitable for production workloads, but is that true? In this talk, Mark looked at some of the misinformation around what "putting something into production" actually means, and provided tips on overcoming the obstacles put in your path.

4. Databases using R: The latest

Many of the techniques covered are based on the speakers' personal and the community's experiences of implementing concepts introduced last year, such as offloading most of the data wrangling to the database using dplyr, and using the RStudio IDE to preview a database's layout and data.
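
For flavour, a minimal version of that workflow (with a hypothetical DSN "datawarehouse" and table "flights") might look like this:

```r
library(DBI)
library(odbc)
library(dplyr)

# connect via an ODBC driver; the DSN is a placeholder
con <- dbConnect(odbc(), dsn = "datawarehouse")

# dplyr builds SQL and runs it in the database, not in R
flights_db <- tbl(con, "flights")

delays <- flights_db %>%
  group_by(carrier) %>%
  summarise(mean_delay = mean(dep_delay, na.rm = TRUE)) %>%
  collect()  # pull only the summarised result into R

dbDisconnect(con)
```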

5. Scaling R with Spark

The sparklyr package provides an interface to Apache Spark, enabling data analysis and modelling of large datasets through familiar packages like dplyr and broom. This talk introduced new sparklyr features that enable real-time data processing, brand-new modelling extensions, and significant performance improvements.
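
A minimal sparklyr session along these lines might look as follows (a local Spark instance for illustration, not code from the talk):

```r
library(sparklyr)
library(dplyr)

# connect to a local Spark instance
sc <- spark_connect(master = "local")

# copy a small built-in dataset into Spark
mtcars_tbl <- copy_to(sc, mtcars, overwrite = TRUE)

# dplyr verbs run inside Spark; fit a model with Spark MLlib
mtcars_tbl %>%
  filter(hp > 100) %>%
  ml_linear_regression(mpg ~ wt + hp)

spark_disconnect(sc)
```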

6. Democratizing R with Plumber APIs

The Plumber package provides an approachable framework for exposing R functions as HTTP API endpoints. This allows R developers to create code that can be consumed by downstream frameworks, which may be R agnostic. In this talk, the speaker took an existing Shiny application backed by an R model and turned that model into an API endpoint, so it can be used by applications that don't speak R.
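
A minimal plumber API in this spirit might look like this (a toy model, not the app from the talk):

```r
# plumber.R -- a hypothetical prediction API
library(plumber)

# fit a toy model once, when the API starts
model <- lm(mpg ~ wt, data = mtcars)

#* Predict mpg for a given vehicle weight (in 1000 lbs)
#* @param wt the vehicle weight
#* @get /predict
function(wt) {
  predict(model, newdata = data.frame(wt = as.numeric(wt)))
}
```

Serving it with plumber::plumb("plumber.R")$run(port = 8000) exposes GET /predict?wt=3 as an HTTP endpoint.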

7. Tidy time series analysis

Time series can be frustrating to work with, particularly when processing raw data into model-ready data. This talk presented two new packages that address a gap in existing time series methodology (raised at rstudio::conf 2018). The tsibble package supports organizing and manipulating modern time series, leveraging tidy data principles along with contextual semantics: index and key. The tsibble data structure flows seamlessly into forecasting routines. The fable package is a tidy renovation of the forecast package; it promotes transparent forecasting practices and concise model representations, empowering analysts to tackle a broad domain of forecasting problems. This collection of packages forms the tidyverts, which facilitates a fluent and fluid workflow for analyzing time series.
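
For flavour, a minimal tsibble-plus-fable workflow on synthetic data might look like this (a sketch; the packages' APIs were still evolving at the time):

```r
library(tsibble)
library(fable)
library(dplyr)

# a synthetic monthly series stored as a tsibble (index = month)
sales <- tibble(
  month   = yearmonth("2015 Jan") + 0:47,
  revenue = 100 + cumsum(rnorm(48))
) %>%
  as_tsibble(index = month)

# fit an exponential smoothing model and forecast a year ahead
sales %>%
  model(ets = ETS(revenue)) %>%
  forecast(h = "1 year")
```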

8. 3D mapping, plotting, and printing with rayshader

In this talk, the speaker demonstrated the creation of beautiful 3D maps and visualizations with the rayshader package. The talk also covered the value of 3D plotting, how interactions with the R community helped drive the development of rayshader, and how writing and blogging about your projects can vastly improve your code.
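
A small example in this spirit, shading and plotting R's built-in volcano elevation matrix (a generic sketch, not code from the talk):

```r
library(rayshader)
library(magrittr)  # for the pipe

# the built-in volcano dataset is a small elevation matrix
elmat <- volcano

elmat %>%
  sphere_shade(texture = "desert") %>%  # shade the terrain
  add_shadow(ray_shade(elmat)) %>%      # add raytraced shadows
  plot_3d(elmat, zscale = 2)            # open an interactive 3D view
```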

9. Spatial Data Science in the Tidyverse

The sf (simple features) package and ggplot2::geom_sf have driven fast uptake of tidy spatial data analysis by data scientists. However, they do not handle some important spatial data science challenges, including raster and vector data cubes and out-of-memory datasets. Powerful methods for analysing such datasets are now available in the stars and tidync packages. This talk discussed how the simple features and tidy data frameworks can handle these challenging data types, and showed how R can work with out-of-memory spatial and spatiotemporal datasets using tidy concepts.
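
As a taste of the tidy spatial workflow, here is a minimal sf-plus-ggplot2 example using the North Carolina counties shapefile that ships with sf (the stars and tidync workflows from the talk need larger datasets):

```r
library(sf)
library(ggplot2)

# read the demo shapefile bundled with sf
nc <- st_read(system.file("shape/nc.shp", package = "sf"))

# an sf object is a data frame with a geometry column,
# so ggplot2 can map attributes to fill directly
ggplot(nc) +
  geom_sf(aes(fill = AREA)) +
  theme_minimal()
```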

Workshops

1. Applied Machine Learning Workshop

This two-day workshop provided an overview of using R for supervised learning. The session stepped through the process of building, visualizing, testing and comparing models that are focused on prediction. The goal of the course was to provide a thorough workflow in R to use with different ML techniques.
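
A minimal sketch of such a prediction workflow, assuming the rsample and parsnip packages (the workshop's exact packages may differ):

```r
library(rsample)
library(parsnip)
library(dplyr)

# split built-in data into training and test sets
set.seed(123)
split <- initial_split(mtcars, prop = 0.8)
train <- training(split)
test  <- testing(split)

# specify, fit, and use a random forest focused on prediction
rf_fit <- rand_forest(mode = "regression", trees = 500) %>%
  set_engine("ranger") %>%
  fit(mpg ~ ., data = train)

predict(rf_fit, new_data = test)
```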

2. Advanced R Markdown Workshop

This workshop was for those who want to take their R Markdown skills to the next level. It covered many low-level details of the rmarkdown package and the broader R Markdown ecosystem. The two goals of this workshop were: 1) learn how to fully customize R Markdown output (HTML, LaTeX/PDF, Word, and PowerPoint); and 2) learn more about existing R Markdown extensions in the ecosystem, such as flexdashboard, bookdown, blogdown, pkgdown, xaringan, rticles, and learnr.
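
As one small example of output customization, options of rmarkdown::html_document can be overridden at render time (a sketch; "report.Rmd" is a hypothetical document):

```r
library(rmarkdown)

# render a hypothetical report.Rmd, overriding HTML output options from R
render(
  "report.Rmd",
  output_format = html_document(
    toc          = TRUE,     # add a table of contents
    toc_float    = TRUE,     # keep it visible while scrolling
    code_folding = "hide",   # collapse code chunks by default
    theme        = "flatly"  # a Bootswatch theme
  )
)
```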

3. Big Data with R Workshop

This two-day workshop covered how to connect to and analyze data that lives outside of R. For databases, it focused on the dplyr, DBI, and odbc packages. For big data clusters, participants also learned how to use the sparklyr package to run models inside Spark and return the results to R. These packages let us use the same dplyr verbs inside R. The workshop also reviewed recommendations for connection settings, security best practices, and deployment options, taking advantage of RStudio's professional tools such as RStudio Server Pro, the new professional data connectors, and RStudio Connect.
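
To illustrate the "same dplyr verbs" point, here is a small sketch that uses an in-memory SQLite database as a stand-in for a remote source:

```r
library(DBI)
library(dplyr)
library(dbplyr)

# an in-memory SQLite database stands in for a remote warehouse
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "mtcars", mtcars)

# the same dplyr verbs, lazily translated to SQL
tbl(con, "mtcars") %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg, na.rm = TRUE)) %>%
  show_query()  # inspect the generated SQL instead of running it

dbDisconnect(con)
```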

4. Shiny in Production | Data Products at Scale Workshop

Do 30,000 simultaneous users for a Shiny application sound challenging? This two-day workshop taught the best practices that make production-quality Shiny application performance possible. People all over the world use Shiny to deliver interactive, visual data products from data science teams to internal and external audiences at scale. The course focused on the open-source and professional ecosystem around Shiny, including performance optimization, testing, and production deployment.
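
One concrete performance technique in this space is plot caching, available in Shiny since version 1.2 (a sketch, not workshop code):

```r
library(shiny)

ui <- fluidPage(
  sliderInput("n", "Sample size", min = 100, max = 10000, value = 1000),
  plotOutput("hist")
)

server <- function(input, output, session) {
  # cache the rendered plot, keyed on the input, so repeated
  # requests with the same value skip the expensive redraw
  output$hist <- renderCachedPlot(
    {
      hist(rnorm(input$n), main = "Cached histogram")
    },
    cacheKeyExpr = input$n
  )
}

shinyApp(ui, server)
```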

5. Introduction to Deep Learning Workshop

This one-day workshop introduced the essential concepts of building deep learning models with TensorFlow and Keras. During the course, participants built and trained neural networks using the keras package.
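
A minimal Keras model in R looks roughly like this (a generic sketch; the workshop's exact examples may have differed):

```r
library(keras)

# load and flatten the built-in MNIST digits
mnist   <- dataset_mnist()
x_train <- array_reshape(mnist$train$x, c(60000, 784)) / 255
y_train <- to_categorical(mnist$train$y, 10)

# a small fully connected network
model <- keras_model_sequential() %>%
  layer_dense(units = 128, activation = "relu", input_shape = 784) %>%
  layer_dense(units = 10, activation = "softmax")

model %>% compile(
  optimizer = "adam",
  loss      = "categorical_crossentropy",
  metrics   = "accuracy"
)

model %>% fit(x_train, y_train, epochs = 5, batch_size = 128)
```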

Summary

With R's usage growing so rapidly, advancements like these are necessary. It was great to see the community moving towards production-level quality in R, with a focus on both execution speed and efficiency. Building on this year's progress, rstudio::conf 2020 holds even more promise. R is sure to go places!