RIDE - A New Data Science IDE for Python and R

The data science world is split into two communities: (I)Python and R. Both offer a plethora of tools and libraries that enrich our working lives as data scientists.

Interestingly, many of the offerings are complementary, such that professional data scientists should know both environments to pick the right tool for the job. In many cases, it even makes sense to use Python and R together in the same project.

Sadly, today these two worlds don’t integrate very well, so we need to switch back and forth between different tools and environments.

Introducing RIDE

RIDE is a new development environment for data science. It aims to be the cockpit for professional data scientists working in multiple languages. By leveraging and extending the awesome JupyterLab, RIDE combines advanced tool support with the interactivity of Jupyter notebooks.

Interactivity

We are firm believers in instant feedback and quick development turnarounds. RIDE provides feedback to the user all the time. This starts with features like intellisense and diagnostics that update as you type, and goes much further.

(Inline) Sourcing

In code editors, you can send the current line to the active session for evaluation. RIDE's flexible layouts allow you to see editors and consoles side by side, as in the following example.

line-sourcing.gif

If you do not want to spend screen real estate on a console, you can enable "inline sourcing". Doing so renders the results right below the statement in the editor:

inline-sourcing.gif

Furthermore, you can source the whole file explicitly or automatically on save.
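
To make the idea of inline sourcing concrete, here is a minimal Python sketch. It is not RIDE's implementation; the function name `inline_source` and the shape of its return value are assumptions made for illustration. It evaluates top-level statements one at a time and records, for each expression statement, the line below which an editor could render the result.

```python
import ast


def inline_source(source, namespace=None):
    """Run top-level statements one by one (hypothetical helper).

    Returns (line_number, result_repr) pairs for expression statements,
    where line_number is the last line of the statement, i.e. the line an
    editor would render the result under.
    """
    namespace = {} if namespace is None else namespace
    results = []
    for node in ast.parse(source).body:
        if isinstance(node, ast.Expr):
            # Expression statement: evaluate it and keep the value.
            code = compile(ast.Expression(node.value), "<inline>", "eval")
            value = eval(code, namespace)
            if value is not None:
                results.append((node.end_lineno, repr(value)))
        else:
            # Any other statement (assignment, def, import, ...): just run it.
            code = compile(ast.Module(body=[node], type_ignores=[]), "<inline>", "exec")
            exec(code, namespace)
    return results


if __name__ == "__main__":
    for line, text in inline_source("x = 2\nx * 21\ny = [x, x + 1]\nlen(y)\n"):
        print(f"line {line}: {text}")
```

A real editor would additionally have to re-run statements when earlier lines change; the sketch only shows how results can be tied to source lines.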

Notebooks, RMarkdown, Shiny and More

Jupyter Notebooks are natively supported because RIDE directly leverages the new awesome JupyterLab. Additionally, RIDE can handle other presentation formats very well. For instance, RMarkdown can be auto-processed on save.

rmd.gif

Of course, you can use the "inline sourcing" feature in *.rmd files as well.

Furthermore, users can start, stop, and refresh Shiny apps from within an editor containing an app.

shiny-app.gif

Language Support

A data model is ultimately a piece of software written in one or more programming languages. Such projects can quickly grow and become complex, so that traditional software engineering practices like testing and debugging become necessary. Why should data scientists not get the same kind of advanced language and tool support that regular software developers have today?

Language support in RIDE goes beyond what the usual data science tools such as RStudio or StatET offer today. We get our inspiration from the best tools in software development, like JetBrains' IntelliJ IDEA or the Java support in Eclipse.

RIDE supports the new "Language Server Protocol" (LSP), through which you get many useful features in code editors, consoles, and notebooks. These features include:

  • intellisense

  • diagnostics

  • navigation

  • hovers

  • and many more
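
For readers unfamiliar with LSP, the following Python sketch shows what a completion request looks like on the wire. The protocol exchanges JSON-RPC 2.0 messages, each preceded by a `Content-Length` header; `textDocument/completion` is the standard method name for completion requests. The helper names and the file URI below are assumptions for illustration and say nothing about how RIDE wires this up internally.

```python
import json


def completion_request(request_id, uri, line, character):
    """Build a textDocument/completion request (positions are zero-based)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "textDocument/completion",
        "params": {
            "textDocument": {"uri": uri},
            "position": {"line": line, "character": character},
        },
    }


def frame_message(payload):
    """Encode a message with the Content-Length header that LSP requires."""
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def parse_message(data):
    """Decode one framed message back into a dictionary."""
    header, _, rest = data.partition(b"\r\n\r\n")
    fields = dict(
        line.split(": ", 1) for line in header.decode("ascii").split("\r\n")
    )
    length = int(fields["Content-Length"])
    return json.loads(rest[:length].decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical document URI; cursor on the 4th line, 8th column.
    request = completion_request(1, "file:///workspace/model.R", line=3, character=7)
    print(frame_message(request).decode("utf-8"))
```

Because the editor and the language server only share this message format, the same editor features can be reused for any language that has a server.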

Supporting such features in a dynamically typed language such as Python or R is, of course, a bit more challenging than with a statically typed language such as TypeScript, Scala or Java. Traditionally, such tool support is provided by a compiler that parses the source code and collects information through type systems and static analysis. For R, we have implemented such a language server and, in addition, combine it with information from a running kernel.
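
The benefit of combining static analysis with a running session can be sketched in a few lines of Python. This is an illustration under stated assumptions, not RIDE's language server: the functions `static_names` and `complete` are hypothetical, and a plain dictionary stands in for the kernel's namespace. Static analysis finds names defined in the file, while the live namespace knows the actual objects and can therefore offer their attributes.

```python
import ast


def static_names(source):
    """Collect names that the source assigns, defines, or imports."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            names.add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            names.add(node.name)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                names.add((alias.asname or alias.name).split(".")[0])
    return names


def complete(prefix, source, live_namespace):
    """Merge candidates from the source text and from a live session."""
    if "." in prefix:
        # Attribute completion needs the real object, so ask the session.
        obj_name, _, attr_prefix = prefix.rpartition(".")
        if obj_name not in live_namespace:
            return []
        attrs = dir(live_namespace[obj_name])
        return sorted(
            f"{obj_name}.{attr}"
            for attr in attrs
            if attr.startswith(attr_prefix) and not attr.startswith("_")
        )
    candidates = static_names(source) | set(live_namespace)
    return sorted(name for name in candidates if name.startswith(prefix))


if __name__ == "__main__":
    source = "import math\ntotal = 0\ndef tally(x):\n    return x\n"
    session = {"table": {"a": 1}, "total": 3}  # stand-in for kernel state
    print(complete("t", source, session))
    print(complete("table.k", source, session))
```

Here `tally` is only known statically (the file has not been run yet), `table` is only known to the session, and the attributes of `table` can be listed only because the session holds the actual object.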

Debugging

When working in a kernel session, the user wants to see what values are available. RIDE offers an environment view where you can navigate through the current scope and inspect any values.

environment-view.gif
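
As a rough idea of the information such an environment view needs from a session, the Python sketch below summarizes a namespace into rows of name, type, size, and a truncated preview. The function `describe_namespace` and the row layout are assumptions for illustration; RIDE's actual protocol between view and kernel is not described here.

```python
def describe_namespace(namespace, max_preview=40):
    """Summarize the values in a namespace for display (hypothetical helper)."""
    rows = []
    for name in sorted(namespace):
        if name.startswith("_"):
            continue  # hide private and internal names
        value = namespace[name]
        try:
            size = len(value)
        except TypeError:
            size = None  # scalars, functions, classes, ...
        preview = repr(value)
        if len(preview) > max_preview:
            preview = preview[: max_preview - 3] + "..."
        rows.append(
            {"name": name, "type": type(value).__name__, "size": size, "preview": preview}
        )
    return rows


if __name__ == "__main__":
    session = {"n": 3, "xs": list(range(100)), "_hidden": 1}
    for row in describe_namespace(session):
        print(row)
```

Note that the sketch calls `repr` on the whole value before truncating; for very large objects a real implementation would want a cheaper preview.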

Furthermore, you can debug your code by setting breakpoints and stepping through it. Again, RIDE's support for this is not tied to R; it is based on a generic kernel extension and will soon be available for Python and other languages, too.

debugging-function.gif
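
To illustrate the mechanism behind a breakpoint, the following Python sketch uses the standard `sys.settrace` hook to capture a function's local variables just before a chosen line runs. It is only a sketch of the general technique, not RIDE's kernel extension; the helper `snapshot_at_line` and the example function `scale` are made up for this illustration.

```python
import sys


def snapshot_at_line(func, line_offset, *args, **kwargs):
    """Call func and copy its locals just before a given line executes.

    line_offset counts lines below the `def` line of func (1 = first body line).
    """
    target_code = func.__code__
    target_line = target_code.co_firstlineno + line_offset
    snapshots = []

    def tracer(frame, event, arg):
        if frame.f_code is not target_code:
            return None  # do not trace other functions line by line
        if event == "line" and frame.f_lineno == target_line:
            snapshots.append(dict(frame.f_locals))
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # always restore the earlier trace hook
    return result, snapshots


def scale(values, factor):
    total = sum(values)
    scaled = total * factor
    return scaled


if __name__ == "__main__":
    # "Breakpoint" on the line that computes `scaled`.
    print(snapshot_at_line(scale, 2, [1, 2, 3], 2))
```

An interactive debugger would pause at this point and wait for commands such as step or continue instead of merely recording the locals.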

Please see for a more detailed description and comparison of RIDE's debug support.

Data Viewer

Data science is all about data, so of course we need to be able to glance at it now and then. The challenge is that we often process large amounts of data and must be careful not to inflate the memory footprint unnecessarily. Since RIDE is a cloud service, you can scale up your workspace if needed, but we still cannot send all the data over the network. Moreover, the available memory should be used in meaningful ways and not wasted carelessly.

Therefore, RIDE's data viewer fetches only the data it needs to present. In fact, it even allows looking at an infinite stream of data. Since the data viewer connects directly to the kernel through a protocol extension, no unnecessary copying of data happens.

data-viewer.gif
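
The principle of fetching only what is displayed can be sketched with a lazy iterator in Python. The class `WindowedViewer` below is a hypothetical stand-in, not RIDE's data viewer: it pulls rows from the source one page at a time, which is why it also works on a stream that never ends.

```python
from itertools import count, islice


class WindowedViewer:
    """Serve fixed-size pages from a possibly unbounded row iterator."""

    def __init__(self, rows, page_size=3):
        self._rows = iter(rows)
        self._page_size = page_size
        self.rows_pulled = 0  # how many rows have left the source so far

    def next_page(self):
        """Fetch only the rows needed for the next page."""
        page = list(islice(self._rows, self._page_size))
        self.rows_pulled += len(page)
        return page


if __name__ == "__main__":
    # An infinite stream of (i, i squared) rows.
    stream = ((i, i * i) for i in count())
    viewer = WindowedViewer(stream, page_size=3)
    print(viewer.next_page())
    print(viewer.next_page())
    print("rows pulled from the source:", viewer.rows_pulled)
```

The sketch only moves forward through the data; scrolling backwards would require either a re-readable source or a small cache of recent pages.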

Support for Python

All the features explained above for R, such as debugging and the data viewer, are also provided for Python 3. Our goal is to provide full feature support for both Python and R, making RIDE a complete data science IDE for model developers.

SQL Kernel

An SQL kernel is also implemented using the same protocols, so that database connections are supported in the same environment.

Next Steps

We will keep working closely with the awesome JupyterLab team, who were not only very open to our (sometimes quite extensive) pull requests but also always super friendly and supportive in all kinds of ways. We are currently looking into open-sourcing even more of the things we built. For instance, it would make sense to open up the kernel extensions we have defined, to allow third parties to use them as well.

If you are interested in trying out RIDE, you can create a free account and start using it now.
