
Creating an Internal R Package: canaR


Use the same code three times, create a function.

Use the same function across multiple projects, create a package?


At CANA, we use the statistical programming language R across several projects because it makes data-driven applications and reproducible reports easy to build. It’s a great option for exploring data, prototyping solutions, or even taking a model to production. This past summer, several R programmers on the CANA team collaborated on an internal package, canaR, to share standard functions and formatting across our R products. In this post, we share some lessons learned from creating an internal package.


The DRY Principle

Most programmers are familiar with the DRY (Don’t Repeat Yourself) principle, which aims to reduce repetition in code. Reasons to follow this principle are abundant: less effort for the programmer, less chance of error across multiple uses of the same code, and streamlined testing.


In R, this translates to the use of functions and packages as a best practice. Packages are a natural unit to distribute code within a team as they include functions, tests, documentation, and vignettes. The package development process has become increasingly accessible in recent years due to tools such as usethis, testthat, and roxygen2.
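As a rough illustration of what a packaged function looks like, the sketch below documents a small formatting helper with roxygen2-style comments. The function name format_percent and its arguments are hypothetical and are not taken from canaR. In a package, a file like this would typically live in the R/ directory (usethis::use_r() can create it), and devtools::document() would turn the comments into help pages.

```r
#' Format a proportion as a percentage label
#'
#' Hypothetical example helper; not part of canaR.
#'
#' @param x Numeric vector of proportions (0.25 means 25%).
#' @param digits Number of decimal places to keep.
#' @return A character vector such as "25.0%".
#' @examples
#' format_percent(c(0.25, 0.5))
#' @export
format_percent <- function(x, digits = 1) {
  # Fail early on non-numeric input instead of returning odd strings.
  stopifnot(is.numeric(x))
  # Scale to percent, fix the number of decimals, and append the sign.
  paste0(formatC(100 * x, format = "f", digits = digits), "%")
}

format_percent(c(0.25, 0.5))
# "25.0%" "50.0%"
```

Keeping the documentation next to the code means the help page and the function are updated together, which is part of what makes a package a convenient unit to share.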


Package Development Approach

The canaR development process was a collaborative effort led by CANA’s R programmers. For our internal package, functions fell into three key categories: style and branding, rounding, and visualizations.

  • Style and branding functions in the canaR package provide uniformity across our R products by creating a standard for document, table, and plot appearance. These offer a jumping-off point for analysts, who can tailor the formatting functions and geoms to different projects. The canaR development team worked with CANA’s graphic designer, Koa Beam, to create custom, colorblind-friendly palettes to fit categorical, sequential, and diverging types of data visualizations.

  • Rounding functions address another crucial and often underestimated challenge for standardized results. Working with results from repeated runs of a simulation, or collaborating with team members who use different software, can lead to tricky rounding problems. Sample functions in the canaR package that tackle these problems align rounding results with what is expected in Excel or control rounding for aggregations.

  • Visualization functions in canaR create unique data visuals not covered by existing packages. One function that is helpful for dateless planning scenarios is the ‘relative timeline,’ created by CANA team member Aaron Luprek. This data visualization function creates a timeline centered on zero, which makes it possible to show results that are time-based but not tied to a specific date.
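To make the rounding point concrete: base R's round() sends exact halves to the nearest even digit, while Excel's ROUND() sends them away from zero. The sketch below shows one way to mimic the Excel behavior. The name round_half_away is hypothetical, and this is not the canaR implementation. The small tolerance is an assumption added to handle values such as 1.005 that are stored slightly below the halfway point, and it also means values within that tolerance of a half are rounded away from zero.

```r
# Base R rounds exact halves to the nearest even digit.
round(c(0.5, 1.5, 2.5))
# 0 2 2

# Hypothetical helper: round halves away from zero, as Excel's ROUND() does.
round_half_away <- function(x, digits = 0) {
  scale <- 10^digits
  # Assumed tolerance for values stored just below the halfway point.
  tol <- sqrt(.Machine$double.eps)
  sign(x) * trunc(abs(x) * scale + 0.5 + tol) / scale
}

round_half_away(c(0.5, 1.5, 2.5))
# 1 2 3
round_half_away(-2.5)
# -3
round_half_away(1.005, digits = 2)
# 1.01

# Rounding before or after aggregating can give different totals.
x <- c(0.4, 0.4, 0.4)
sum(round(x))
# 0
round(sum(x))
# 1
```

The last two lines are why a shared convention for aggregations matters: two analysts can start from the same numbers and report different totals depending on when they round.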


Sample Relative Timeline Graphic
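For readers who want to experiment with the idea, here is a minimal sketch of a zero-centered timeline using only base R graphics. It is not the canaR function. The function name plot_relative_timeline, the phase names, and the day offsets are all made up for illustration.

```r
# Hypothetical planning phases, expressed as day offsets from "day 0".
phases <- data.frame(
  phase = c("Planning", "Staging", "Execution", "Recovery"),
  start = c(-30, -10, 0, 15),
  end   = c(-10, 0, 15, 25)
)

# Draw each phase as a horizontal bar on an axis centered on zero.
plot_relative_timeline <- function(df) {
  # Symmetric limits keep zero in the middle of the x-axis.
  limit <- max(abs(c(df$start, df$end)))
  n <- nrow(df)
  y <- rev(seq_len(n))  # first phase is drawn at the top

  # Widen the left margin for the phase labels; restore settings on exit.
  op <- par(mar = c(4, 7, 1, 1))
  on.exit(par(op))

  # Empty canvas, then a labeled axis, one bar per phase, and a zero marker.
  plot(0, 0, type = "n", xlim = c(-limit, limit), ylim = c(0.5, n + 0.5),
       xlab = "Days relative to day 0", ylab = "", yaxt = "n")
  axis(2, at = y, labels = df$phase, las = 1)
  segments(df$start, y, df$end, y, lwd = 8)
  abline(v = 0, lty = 2)

  # Return the axis limits invisibly so callers can inspect them.
  invisible(c(-limit, limit))
}

plot_relative_timeline(phases)
```

Because the x-axis carries offsets rather than calendar dates, the same plot can be reused for any scenario that shares the phase structure, whatever its eventual start date.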

Once the functions were built, the package development team worked together to create clear documentation, examples, and vignettes for future users. We used the testthat package to streamline our testing approach.
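A testthat unit test is short enough to show in full. The sketch below tests a hypothetical helper, relative_day_label, that is defined here only so the example is self-contained; it is not a canaR function. In a package, the tests would normally sit in tests/testthat/ (usethis::use_testthat() sets up that folder) and run with devtools::test().

```r
library(testthat)

# Hypothetical helper: label a day offset relative to day 0,
# for example "D-10", "D", or "D+5".
relative_day_label <- function(offset) {
  ifelse(offset == 0, "D", sprintf("D%+d", as.integer(offset)))
}

test_that("offsets are labeled relative to day 0", {
  expect_equal(relative_day_label(c(-10, 0, 5)), c("D-10", "D", "D+5"))
})

test_that("labels are character vectors matching the input length", {
  out <- relative_day_label(-3:3)
  expect_type(out, "character")
  expect_length(out, 7)
})
```

Each test_that() block names one behavior, so a failure message points directly at the expectation that broke.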


The final touch was the package name. In the tradition of some of our favorite tidyverse packages, we saw the opportunity to incorporate a little French (à la magrittr) and an animal reference (à la purrr). Hence canaR, a sly reference to ‘canard,’ French for duck, complete with a diverting duck logo.


Resources

Interested in creating your own R package? There are many resources available! Here are just a few that our team found helpful in developing canaR:



Lucia Darrow is a Senior Operations Research Analyst at CANA Advisors and can be reached through her LinkedIn profile.
