RIDE Server - A Multi-User and Scalable Data Science IDE Server

Moving to a server-based infrastructure means choosing a platform that can take full advantage of container technologies and modern software development tools. This whitepaper illustrates how RIDE Server meets the key requirements of a server-based IDE. Powered by Jupyter and Eclipse Che, RIDE Server delivers a unique, smart development platform that lets enterprises manage the development work and workloads of data science teams at any scale. Moreover, with RIDE Server, enterprises save significantly by purchasing a single license for all the languages and frameworks they need, and they avoid the complexity of implementing and integrating multiple tools into their systems.

INTRODUCTION

The migration of model development to server-based systems is rooted in two longstanding, competing interests within IT departments: those of IT administrators and those of data scientists. The former favor stability, security and control, while the latter demand their choice of languages, frameworks and processes. These differences create tension over who controls development servers and who chooses programming standards, with developers favoring microservices adapted to each scenario and IT preferring inflexible templates based on tried and tested configurations.

A popular alternative is central servers managed by IT, usually on VMs, but adoption of VM-based solutions has started to diminish because VMs are large, difficult to share, expensive and not conducive to collaboration.

Data scientists tend to keep control of their own code and computing resources unless they are given a secure system in which to collaborate and share their work, something the growth of server-based solutions now makes possible.

Server-based development

With the rise of container technology (e.g. Docker), the entire development process, including workspaces and their runtimes, can now be hosted on a centralized server system, enabling developers to move away from traditional desktop environments. In this context, browser-based IDEs offer the ideal separation: IT retains root control of the system, while developers use Docker and other tools to define their programming stacks.

This momentum is also evidenced by the growing adoption of server-based development environments. While RStudio intends to offer its own server-based development system, Anaconda (formerly Continuum Analytics) avoids the rigidity of a closed system by opting for open-source projects such as JupyterLab.

WHY RIDE Server?

Although every player in the data science solution industry is moving towards server-based solutions, none yet provides a complete solution for model developers. Here are a few examples:

 

Single Language System:

The Challenge: In contrast to software development IDEs such as Eclipse, data science IDE providers concentrate on a single language and treat the others as second-class citizens. For instance, RStudio serves R only, while Jupyter is built for Python.

RIDE Server Solution: RIDE IDE supports the popular data science languages, such as R, Python and SQL, with full features including content assist, environment and data views, and a debugger. It is also extensible through its Language Server Protocol support and kernel-based structure.

 

Missing Development Frameworks and Features:

The Challenge: Support for different development frameworks and advanced features is limited in other solutions. For instance, data scientists love to work in notebooks. RStudio therefore aims to provide a notebook-like solution through its RMarkdown framework, while Jupyter notebooks do not support features such as an environment view or a debugger.

RIDE Server Solution: RIDE Server supports RMarkdown, notebooks and a text editor for both R and Python. Thanks to its in-house R kernel technology, R-specific features such as Shiny are supported as well. Whether users prefer to develop in a notebook or a professional text editor, and in R or Python, they get all the features of a data science IDE, such as data and environment views, a debugger and a database connection system.

 

Traditional Load Balancing Systems:

The Challenge: Although server-based solutions such as JupyterHub or RStudio Server deliver a multi-user system, the heart of the technology is still a traditional load balancing system, which is not suited to serving different applications with unpredictable resource requirements and stability. As a result, these systems are always at risk of crashing or becoming unstable when one developer consumes most of the resources or that developer's session crashes.

RIDE Server Solution: Using container technology (Docker), every user works in a secure and isolated environment with predefined resources and dependencies. As a result, the system distributes resources among users in a stable way, without the risk of one user crashing the whole system.
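
As a rough illustration of per-user isolation with predefined resources, the sketch below builds a `docker run` command that caps the memory and CPU of one user's workspace container. It is not RIDE Server's actual launch logic: the function name, the container naming scheme, the image name and the default limits are all hypothetical. Only the Docker flags themselves (`--detach`, `--name`, `--memory`, `--cpus`) are standard.

```python
import shlex


def docker_run_command(user, image, memory="4g", cpus=2.0):
    """Build a `docker run` command that caps one user's workspace container.

    `user`, `image` and the default limits are illustrative assumptions,
    not values taken from RIDE Server.
    """
    return [
        "docker", "run", "--detach",
        "--name", f"workspace-{user}",  # hypothetical naming scheme
        "--memory", memory,             # hard memory limit for this container
        "--cpus", str(cpus),            # CPU share this container may use
        image,
    ]


# Print the command rather than running it, so the sketch has no side effects.
cmd = docker_run_command("alice", "example/datascience-stack:latest")
print(shlex.join(cmd))
```

Because the limits are attached to each container, a runaway job in one workspace is stopped at its own ceiling instead of starving the other users on the host.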

 

System Administration:

The Challenge: High-level user management, such as adding or removing users or defining limits on available resources, requires solid knowledge of the operating system.

RIDE Server Solution: Most high-level controls are available to system administrators through a dashboard, enabling system management with mouse clicks and without any direct interaction with the operating system.

 

Compatibility and System Management:

The Challenge: Most server-based systems depend on a specific operating system or distribution. In addition, maintaining, backing up or upgrading them is usually challenging and needs extra attention.

RIDE Server Solution: RIDE Server runs anywhere that Docker runs. For example, it runs on Mac, on almost all Linux distributions and even on Windows systems. Installing or upgrading RIDE Server takes a single command, and all workspaces, specifications and user information are stored in a single directory for easy backup and system maintenance.
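
Keeping everything in one directory means a backup can be as simple as archiving that directory. The sketch below shows the idea with Python's standard `tarfile` module. The function name and the paths in the usage comment are hypothetical, since the text does not say where RIDE Server keeps its data, and a production backup would typically also stop or quiesce the server first.

```python
import tarfile
from pathlib import Path


def backup_data_dir(data_dir, archive_path):
    """Archive a whole data directory into one gzip-compressed tarball.

    Both arguments are placeholders; the actual location of the RIDE Server
    data directory is not given in the text.
    """
    data_dir = Path(data_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the directory name
        tar.add(data_dir, arcname=data_dir.name)
    return Path(archive_path)


# Example call with hypothetical paths:
# backup_data_dir("/srv/ride-data", "/backups/ride-data.tar.gz")
```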

 

RIDE Server Technology

RIDE Server brings together leading technologies in container management, data science tools and IDE systems. It has two major elements: the IDE and the server system.

Server System:

For the server system, RIDE builds on Eclipse Che, the state-of-the-art technology from Red Hat for managing development stacks and container orchestration systems. Thanks to this technology, RIDE Server is a scalable, secure, multi-user and multi-tenant environment capable of serving thousands of users. Additional security features, such as SSL communication and running the system behind a firewall, are also provided. RIDE Server is delivered with four prebuilt development stacks: plain vanilla, Data Science, TensorFlow and Spark. Organizations can develop their own development stacks or request a custom-designed recipe from R-Brain.

IDE System:

R-Brain IDE combines state-of-the-art technologies from data science and software development systems. At the heart of RIDE is JupyterLab, extended with additional features and enhancements:

  • R Kernel: The R kernel in RIDE differs from the classic Jupyter R kernel. The RIDE kernel is significantly faster and is optimized for extra features such as Shiny and the debugger.

  • Text Editor: The Monaco text editor, which is the heart of VS Code, is integrated into RIDE. Thanks to this advanced editor, users get code formatting, hover information, signature help and a context menu in both editors and notebooks.

  • Xtext and LSP technology: IntelliSense in RIDE is delivered by a combination of Xtext and Language Server Protocol technologies, providing an extremely fast and extensible language support environment.

CONCLUSION

Moving to a server-based infrastructure means choosing a platform that can take full advantage of container technologies and modern software development tools. This whitepaper illustrates how RIDE Server meets the key requirements of a server-based IDE. Powered by Jupyter and Eclipse Che, RIDE Server delivers a unique, smart development platform that lets enterprises manage the development work and workloads of data science teams at any scale. Moreover, with RIDE Server, enterprises save significantly by purchasing a single license for all the languages and frameworks they need, and they avoid the complexity of implementing and integrating multiple tools into their systems.
