RvsPython #6: LinkedIn has spoken!

Articles From: bensstats
Website: bensstats

Introduction

During my time on LinkedIn, I have been exposed to many people from many different lines of work. I have also managed to carve out a space for myself there, where I can post about Data Science topics and share my blogs along the way. There have always been posts and polls comparing R and Python, along with the subsequent debates among users of the languages over which one is superior for doing Data Science. While these sorts of arguments will never end, and I am far from innocent of engaging in them, I decided to try to understand why Data Science practitioners prefer one language over another by “controlling” for exposure to the other language.

In this blog I am going to share the results of my LinkedIn polls comparing respondents’ preferences. The polls asked about respondents’ preferences for:

  1. Using dplyr in R vs pandas in Python for data wrangling,
  2. Using ggplot2 in R vs matplotlib and seaborn in Python for data visualization, and
  3. Using Jupyter notebooks vs RMarkdown for writing reports.

Disclaimer

This is by no means a formal study; it’s more of just me sharing my findings in blog form. Social media platforms come and go, but having a blog (albeit a less popular one) gives me a place where I can post my curated content. Likely due to LinkedIn’s algorithms, my first and second questions got more traction, with over 132,000 views combined and over 1,600 and 1,300 votes respectively, while my last question only got a little more than 4,000 views and 106 votes at the time of writing.

To quote a comment on one of my polls:

With this in mind, let’s share the results of these polls.

(Visuals were made with ggplot2 and the ggtech package for the theme)

1. dplyr vs pandas

As expected, most users who were pro-pandas had never used dplyr. However, when controlling for prior experience, it was pretty much a 50-50 split among respondents between pandas in Python and dplyr in R. Some comments recommended that I check out the data.table and dtplyr packages in R; while I don’t have much exposure to those packages at present, I hope to check them out in the future.

For the closest experience to dplyr that I have had in Python, check out my review of the siuba module.
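To make the idea of “controlling” for prior experience concrete, here is a minimal Python sketch. The poll option labels and the vote counts below are hypothetical (they are not the actual LinkedIn tallies); the point is only to show how the raw split and the split among respondents who have used both tools can differ.

```python
# Hypothetical tallies for illustration only; NOT the real poll results.
# The option labels are also assumed, not copied from the actual poll.
votes = {
    "prefer pandas, never used dplyr": 500,
    "prefer pandas, used both": 200,
    "prefer dplyr, used both": 200,
    "prefer dplyr, never used pandas": 100,
}


def share(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


# Raw split: every vote counts, regardless of experience.
total = sum(votes.values())
pandas_all = (
    votes["prefer pandas, never used dplyr"] + votes["prefer pandas, used both"]
)
raw_pandas_share = share(pandas_all, total)

# "Controlled" split: only respondents who have used both tools.
used_both = votes["prefer pandas, used both"] + votes["prefer dplyr, used both"]
controlled_pandas_share = share(votes["prefer pandas, used both"], used_both)

print(f"pandas share, all voters: {raw_pandas_share:.0f}%")
print(f"pandas share, used both:  {controlled_pandas_share:.0f}%")
```

With these made-up numbers, pandas looks dominant overall but is an even split once you restrict to people who have tried both, which is the kind of pattern described above.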

2. ggplot2 vs matplotlib and seaborn

Among users who had experience with both ggplot2 and matplotlib/seaborn, ggplot2 was preferred by 56%. Most users of matplotlib and seaborn don’t have experience with ggplot2, and vice versa.

I was told to check out the plotly library, which is available for both R and Python, and it really looks like a great library for building interactive dashboards and applications. While I don’t have much experience with it now, I do hope to check it out when time allows.

3. Jupyter notebooks vs RMarkdown for writing reports

The results from this poll are questionable, as it only got 106 replies. With that caveat, here are the results:

Of the users with experience using both RMarkdown and Jupyter notebooks for writing reports, 63% prefer RMarkdown over Jupyter notebooks. However, more users have experience with Jupyter notebooks than with RMarkdown.
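To get a feel for why a small sample makes the 63% figure shaky, here is a rough Python sketch of an approximate 95% Wilson score interval for a proportion. The number of respondents who had used both tools is not reported here, so the sketch uses the full 106 replies as an upper bound on that subgroup (an assumption); the real subgroup is smaller, so the real interval would be wider. The function name and the second sample size of 50 are hypothetical and only there for illustration.

```python
import math


def wilson_interval(p_hat, n, z=1.96):
    """Approximate 95% Wilson score interval for an observed proportion.

    p_hat: observed share (e.g. 0.63), n: sample size, z: normal quantile.
    """
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


# Assumption: n = 106 is the total number of replies, used as an upper bound
# on the (unreported) number of respondents with experience in both tools.
low, high = wilson_interval(0.63, 106)
print(f"n=106: {low:.1%} to {high:.1%}")

# Hypothetical smaller subgroup, to show how the interval widens.
low_small, high_small = wilson_interval(0.63, 50)
print(f"n=50:  {low_small:.1%} to {high_small:.1%}")
```

Even in the best case the interval spans a wide range around 63%, and it only gets wider as the subgroup shrinks, so this result is best read as a hint and not as a firm finding.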

Conclusion

All things considered, using dplyr in R or pandas in Python for data wrangling seems like a toss-up among users with experience in both. For data visualization, ggplot2 seems to be preferred over matplotlib and seaborn, and if you trust the sample size, RMarkdown is preferred over Jupyter notebooks among users with experience with both.

In general, it is apparent that R is still the underdog as a language for Data Science and programming, but by no means do I intend to stop using it any time soon.

When I get the time, I look forward to giving data.table and plotly a spin!

Thank you for reading!

Visit Bensstats Blog for additional insight on this topic and subscribe to his newsletter: https://bensstats.wordpress.com/category/rvspython/.

This material is from bensstats and is being posted with its permission.