
Search Results
- Jon’s Mission
*Event photography provided by Walter Golf Photography

Today, on September 10th, 2024, World Suicide Prevention Day, we are reminded of the critical importance of supporting those affected by mental health struggles, especially among our veterans. This year marks the seventh year of our dedicated efforts in raising awareness and support for veteran suicide prevention, a cause that remains deeply meaningful to us at CANA Foundation. Our commitment to giving back to the community is reflected in our ongoing partnership with CANA's Principal Information Systems Security Manager, Kurt Eades, and Jon’s Mission for 22.

In the blog post below, Kurt shares his personal experience at the annual Jon’s Mission for 22 event back in 2019, highlighting how community support plays a vital role in this fight. From honoring veterans who have lost their battle with PTSD to raising funds for impactful organizations like Mission 22, Project Zero, and the Tragedy Assistance Program for Survivors (TAPS), each effort at this event underscores our collective responsibility to support those who have served our country. On this World Suicide Prevention Day, let us come together to honor those who have served and continue to fight their own battles, and remember that by standing united, we can make a meaningful difference.

In observance of this Veterans Day and founded on the principle of “Giving Back” to our communities, the CANA Foundation recently had the privilege to support our very own Kurt Eades and Jon’s Mission for 22 in a fundraising event to raise awareness and support for veteran suicide prevention. As a company with many veterans in its ranks, the issue of veteran suicide is “near and dear to CANA’s heart” and supporting Kurt and Jon’s Mission for 22 was a wonderful way to give back and to raise awareness for this epidemic affecting so many veterans and their families.
Kurt shares his personal account of the event below:

I have been a friend of the Busbin Family for nearly 5 years, and I have been a sounding board and dedicated supporter of the Veteran Suicide Awareness battle alongside Fred and Laura Busbin since their son, Jon, lost his battle on Oct 14, 2017. This is the 2nd Annual Jon's Mission for 22 event. The event brings together people fighting a common illness to support those who need it. In doing this, Fred and Laura Busbin and their family continue to celebrate the life of their son who lost his battle with Post Traumatic Stress Disorder (PTSD).

Mission 22 (www.mission22.com) and Project Zero (www.facebook.com/projectzerovets) are organizations that are working tirelessly to stop this epidemic, and they are like no other organizations we have found. They truly put the donated money where it is needed: to the military service members! Tragedy Assistance Program For Survivors (TAPS) (www.taps.org) is an organization that supports the families left behind after any veteran death, and they provide much needed support to families in their most difficult times. ALL the money raised during this event will be divided between these three organizations.

On Friday, October 11th, 2019, I was part of a group who set up a small trail for the event, Trail for 22. This trail honored 50 veterans who lost their battle with PTSD, and the thousands who are currently suffering. Included in the named ribbons on each tree was one soldier from Canada, exemplifying the fact that this is not just a U.S. illness; it is a worldwide illness. Each ribbon was adorned with an American flag, with Canadian flags for the one Canadian soldier. Flags were also placed in the ground in front of each tree.

On Saturday, October 12th, during the actual 2nd Annual Jon's Mission for 22 event, my role was trail guide, videographer using the CANA CDS01 Drone and my personal Jeep with a GoPro, safety, and generally anything that was asked of me.
The hours I spent with the event-goers, volunteers and Jeepers, talking as we set up the event or while we were tearing down after the event, were priceless. We shared stories about friends, family and brothers and sisters in arms who are or were fighting this battle. Two veterans participating in the event were chosen to receive a set of special "22" Hoodskulls for their Jeeps. These skulls are 3D-printed, and the skull wears a helmet with 22 on the front and a "B" for Busbin on the back.

The event drew in approximately 250 Jeepers, 4-wheelers, side-by-sides and motorcycles to the Possum Creek Offroad Park in Ray City, Georgia. We also had 70 items that were raffled off to raise additional donations. The Opening Ceremony was performed by The Honor and Rifle Guards of the American Legions in Valdosta, Sylvester and Albany, GA. The Color Guard was from the Worth County, Georgia High School JROTC. The event raised $4,355 on Saturday from entry fees, and another $5,718 from raffle tickets, bringing the total amount raised to $10,073! People came in for the event from as far away as Frederick, Maryland and the Florida Keys.

To learn more, visit Jon's Mission for 22, Inc. at jonsmissionfor22.com.

CANA Advisors sends a heartfelt “thank you” to all our veterans as well as those men and women serving in uniform today. Wishing all a safe, reflective, and happy Veterans Day!

#PTSD #suicideawareness #veterans #CANAFoundation #KurtEades #JonsMission #Mission22
- CANA Foundation Supports Randolph-Macon Academy’s Leadership Symposium
CANA’s Terry Hagen recently led and facilitated a group discussion on data analytics at the Randolph-Macon Academy in Front Royal, Virginia. Randolph-Macon Academy students, along with students from five additional Northern Virginia schools, interacted and exchanged ideas in a day-long Leadership Symposium. A variety of industry experts provided overviews and case studies in the areas of Health Sciences, Military, Cyber/Computer Science, Aviation (Drones), Data Analytics, Engineering, and Non-Profit. The themes of corporate challenges and ethics, organizational corporate culture, and effective leadership were highlighted and culminated in break-out group presentations. The following blog post provides more details on the event: https://www.rma.edu/blog/leadership-symposium-r-ma-hailed-success

CANA Foundation looks forward to continuing to support Randolph-Macon Academy in their emerging innovations lab and providing expertise in analytics and unmanned aerial systems curriculum development.

Terry Hagen is a Principal Logistics Analyst at CANA Advisors. Contact him at thagen@canallc.com

#CANAFoundation #outreach #leadership #RandolphMaconAcademy
- Spring, NCAA Basketball, and Sports Analytics Podcasts are in the Air!
If you want to learn more about sports analytics, use your favorite podcast listening app to download the March episode of INFORMS Resoundingly Human. CANA Advisors' very own Director of Analytics Capabilities, Walt DeGrange, is one of the featured guests. He highlights both surprising examples of analytics in sports and how CANA has helped professional teams use analytics to succeed. The episode also has tips on winning your NCAA Tournament pool with analytics and creating the Major League Baseball schedule with optimization. https://informs.libsyn.com/resoundingly-human-march-2019

Walt DeGrange is the Director of Analytics Capabilities at CANA Advisors wdegrange@canallc.com

#podcast #INFORMSSpORts #sportsanalytics #informs #WaltDeGrange
- Reflections on rstudio::conf 2019
At CANA Advisors, we use R daily, for everything from exploratory analysis to generating publication-quality documents with R Markdown and interactive web apps with Shiny. In January, I made the trek down to Austin, TX to attend the RStudio conference. This was the largest RStudio conference to date, buzzing with nearly 2,000 attendees. Here are a few reflections from those two enlightening days.

R in Production: Not only possible, but easy!

The first keynote speaker, Joe Cheng, broke down how the RStudio team is working to make R in production effortless for R programmers with little to no experience in web development. New tools like Shiny load test, plot caching, and async are now available to supplement old standbys RStudio Connect and Profvis. Throughout his presentation, Cheng stressed the importance of addressing the cultural and organizational barriers to scaling R. The ability to swiftly take analysis from an exploration in R to production presents a new role for data scientists, one that must be taken on mindfully, respecting and relying on the expertise of IT and engineering teammates.

The theme of production carried throughout the conference, with several presentations on the topic. Of note was a presentation from a team at T-Mobile, who shared a familiar story with the audience: presenting a Shiny application to high-level leadership gave credibility to their project, sparked interest, and eventually earned them additional resources to continue their work. From there, their engineering and data science teams worked together to put a Keras model into production, which is now responding to customer requests in real time. This is one of the most robust R models I’ve seen in terms of scale; seeing how they overcame technical and cultural barriers to create a fast, compact app was incredibly valuable.
Defining Data Science

In the second keynote talk of the conference, Felienne Hermans shared her joys and challenges in studying how people learn programming languages. Although this may be changing for the next generation, the majority of practicing data scientists do not have a shared memory of what it looks like to learn the tools of our trade. In part, this may be because data science programs didn’t exist until a few years ago, but the issue is even more fundamental. The view that learning how to code is exploratory, done individually with great struggle, is very common, because that is how so many of us learned how to code. What if, as with math or reading, there is worth in structured education for this line of work?

Data science is supposedly the “sexiest job of the 21st century.” Despite all of the hype surrounding this career, there is little consensus within the analytics community on what actually defines a data scientist, apart from a high salary. As operations research (OR) analysts, we see great overlap between our two disciplines. OR can certainly be considered a necessary predecessor of data science. While at the RStudio conference, I was surprised to see how many presenters and conference attendees, from a variety of disciplines, identified as data scientists. To an extent, it feels as if “data scientist” is an attribute that we can append to those who have mastered modern analytics techniques, whether they be biologists, geographers, or operations research analysts.

Angela Bassa gave an inspiring talk about growing data science teams and taking advantage of the unique strengths individuals may bring to an organization. A panel discussion with industry leaders also addressed this topic with a conversation about how to manage a diverse set of team members.

Strength in Community

As with most R events, Hadley Wickham opened the conference with a Code of Conduct, reinforcing the importance of inclusivity and diversity.
The “Pacman Rule” of always leaving space for a new face to join a conversation was honored throughout the conference. On the final evening, I connected with other RLadies at a social event. I picked up great tips on successful events from folks at different chapters from around the globe to take back to my own chapter. This event continued a theme of openness and inclusivity felt throughout the conference.

Photo credit: JD Long

Interested in learning more? Many of these talks are available online. Check out https://resources.rstudio.com/rstudio-conf-2019 to experience the conference.

Lucia Darrow is an Operations Research Analyst at CANA Advisors. To find more content on our favorite professional events, continue to visit our CANA Connection.

#RStudio #shiny #RLadies #LuciaDarrow
- Tech Meetups: Working Remotely, Connecting Locally
Analytics is a rapidly evolving discipline. Staying up to date with the latest methods and software can be a daunting task to undertake alone. Local meetups provide access to a community of professionals with similar goals. In particular, Meetup.com is one site that supports self-organizing groups, which generate events based on a common interest. In the tech space, these can range from networking events to book clubs and hackathons. In the D.C. area alone, there are over 100 meetup groups related to analytics and data science. In this post, three CANA team members, Lucia, Walt, and Jerome, discuss how they use meetups to learn, network, and give back to their communities.

Lucia Darrow | I use meetups primarily to learn. One of my favorite data science meetups in the Vancouver area hosts monthly briefs based on the online competition website Kaggle. Local teams present a Kaggle competition, share their approach, and then explain the methods of the highest-ranking teams. These presentations give me a chance to think about how to handle diverse problems, often with different data types than I work with in my day-to-day tasks. The combinations of algorithms crafted by the winning teams are fascinating! One of my top takeaways is the relative success a simple method can have against these highly complex solutions.

Another way I use meetups is to connect with other women and gender minorities in tech, through groups like RLadies and PyLadies, that aim to give minorities a voice in tech communities. RLadies is an organization promoting gender diversity in the R community with meetup groups in 131 cities worldwide. You can check out a map of current locations here. For me, this group provides a bridge to the larger R community and a connection to a wide range of experienced R users.

Hadley Wickham speaking at an RLadies Meetup, August 2018

Walt DeGrange | One of the great things about living in the Research Triangle is the abundance of tech companies.
SAS, Red Hat, Lenovo, Cisco, IBM, and many more call the Triangle home. It also has three major universities (the University of North Carolina, Duke, and NC State) and lots of smaller schools. Given this abundance of analytics, the area boasts many Meetup.com groups. There are groups for languages like Python, R, SAS, and Julia and techniques such as machine learning and AI. There are even groups specific to certain genders like the R-Ladies RTP.

In May 2018, the Research Triangle Analytics group asked me to speak. The venue was the SAS Training Center in Cary, NC. The facility was state-of-the-art and had more screens than I had ever seen in one room. I shared three stories from my past that focused on challenges in implementing analytics solutions. There was an excellent discussion after my presentation and several analysts shared their personal experiences.

"I love interacting with this diverse group of analytical professionals. The meetups give someone that works at home an opportunity to interact and share in person with other analysts. It also allows me to see what is cutting edge in other industries. The cost of getting started is a little time to log into Meetup.com and search for analytics. As an additional bonus, many of the meetings are sponsored by organizations that supply pizza for the evening gatherings." - Walt DeGrange

Jerome Dixon | I use meetups to learn, network, and collaborate where it fits with projects or technologies I’m interested in. My focus is machine learning, becoming a better programmer, and keeping up with the myriad of tools and techniques constantly getting introduced into the data science and technology space. One of the benefits of living in Richmond is the tech companies that sponsor meetups and share examples of use cases that they have actually put into production. I’ve seen amazing presentations by Capital One on how they use machine learning, with hints about their data engineering infrastructure.
I’ve sat through a presentation by CarMax on a very novel use case for Tableau Server as basically an extract, transform, and load (ETL) tool. Ippon has hosted some great meetups on how they are using Apache Spark as well as some best practices for their IT project management. The picture below is from a talk by NVIDIA data scientist May Casterline on the work they are doing with image processing, deep learning networks, and GPU dataframes. Leading edge technology!

And when I get stuck, there are PyRVA (Python user group), RVA R User’s Group (R statistical programming language), RVA Linux User Group (Linux, Amazon Web Services), and the Docker Richmond meetup group (for containerization). These are great opportunities for both support and networking with Richmond’s local meetup community. I am very fortunate and grateful for the city I live in. If you are lucky enough to be located in a great meetup community, please leverage it!

*CANA Advisors is a veteran- and woman-owned leading logistics and analytics agency based out of Gainesville, VA, USA. For more information about CANA Advisors and its world class team visit canaadvisors.com

#LuciaDarrow #WaltDeGrange #JeromeDixon #techmeetup #meetupcom #kaggle #RLadies #PyLadies #python #R #NVIDIA
- Why the BOM Gives Us a Headache
One of our DoD clients asked where they could apply machine learning to improve their repair processes. What makes this process challenging is that repair personnel maintain a list of parts required to do a specific repair task. If, for example, your car needs an oil change, then the list of parts for the maintenance would include one oil filter and a few quarts of oil. If the list of parts were always the same, then determining the list would be trivial. Unfortunately, there is always the possibility of performing scheduled maintenance and finding other parts that require repair. There is also the additional complexity of parts being superseded by newer parts. When you scale this up to a fleet of thousands of vehicles and tens of thousands of repairs per year, determining the required part list from maintenance personnel experience alone is a daunting task. We recommended an innovative approach using natural language processing (NLP) for creating a Bill of Material (BOM).

What is a Bill of Material (BOM)?

A Bill of Material (BOM) is a breakdown of items with associated configuration data needed to repair a subassembly or component of a larger system. Usually, these are broken up into a manufacturing domain and a supply chain domain, but there may be additional domains depending on the complexity of the part and the processes needed to repair or procure it.

Figure 1: The BOM connects the critical domains for a system's supportability

Oleg Shilovitsky has written several blog posts and presentations on the challenges of BOM management. His presentation provides a backdrop on the issues that make BOM management challenging. http://beyondplm.com/2016/03/08/pi-munich-presentation-develop-single-bom-strategy-3/

What drives our analysis?

Operational requirements drive how many vehicles are ‘Awaiting Repair’. Accurate on-hand inventory (BOM Forecast) and ‘Repair Rate’ drive the number of ‘Repaired Vehicles’ delivered.
Figure 2 is a high-level depiction of our repair process.

Figure 2: High-Level Repair Process

We first match the number of ‘Repaired Vehicles’ to the ‘Repair Type’ performed over a set period of time. Our set period of time is typically the previous repair schedule. After we match the repair types completed to the number of issues per vehicle, we then calculate the frequencies of the parts ordered. Our final step is to divide the frequency counts by the number of vehicles per period to get a Replacement Rate (RR) per individual part. This RR with a part number now becomes our forecasted Bill of Material (BOM).

Natural Language Processing (NLP) Refresher

A term document matrix is a way of representing the words in the text as a table (or matrix) of numbers. The analysis uses the rows of the matrix to represent the text responses, and the columns of the matrix to represent the words from the text. Once in a term document matrix format, we can apply various text-mining algorithms. Figure 3 provides an example of this process.

Figure 3: NLP Bag of Words

Part numbers or National Item Identification Numbers (NIINs) can be represented as words or character vectors for text mining analysis. By representing this as a text mining problem, we gain efficiencies in computer memory utilization as well as access to additional analysis methods. Job Order Number (JON) refers to the type of repair action and the parts and procedures required for the repair. Control Order Number (CON) refers to the higher-level JON that a major repair action falls under. CONs and JONs will be our “bags” for collecting words (NIINs). Figure 4 shows a high-level view of how JONs (sentences) are made up of NIINs (words).

Figure 4: Parts Data in Array Format

BOM Building Example

We use R to create the required data structure and perform the frequency analysis.
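As a rough illustration of the steps described above (bag-of-words matrix, frequency counts, Replacement Rate), here is a minimal sketch in Python. It is not the R code used in the analysis. The JON and NIIN identifiers, the function names, and the vehicle count are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
from collections import Counter

# Hypothetical issue data: each Job Order Number (JON) is a "document"
# and each issued part (NIIN) is a "word". All identifiers are made up.
issues_by_jon = {
    "JON-001": ["NIIN-A", "NIIN-B", "NIIN-B"],
    "JON-002": ["NIIN-A", "NIIN-C"],
    "JON-003": ["NIIN-A", "NIIN-B"],
}


def term_document_matrix(docs):
    """Return (vocabulary, matrix): one row per JON, one column per NIIN."""
    vocab = sorted({niin for parts in docs.values() for niin in parts})
    matrix = {}
    for jon, parts in docs.items():
        counts = Counter(parts)  # missing NIINs count as 0
        matrix[jon] = [counts[niin] for niin in vocab]
    return vocab, matrix


def replacement_rates(docs, vehicles_in_period):
    """Divide each part's total frequency by the vehicles repaired in the period."""
    vocab, matrix = term_document_matrix(docs)
    rates = {}
    for col, niin in enumerate(vocab):
        frequency = sum(row[col] for row in matrix.values())
        rates[niin] = frequency / vehicles_in_period
    return rates


if __name__ == "__main__":
    # Assumed for illustration: four vehicles repaired in the previous schedule.
    for niin, rate in replacement_rates(issues_by_jon, 4).items():
        print(niin, rate)
```

The resulting mapping of part number to rate plays the role of the forecasted BOM in this toy setting. A real data set would likely call for a sparse matrix rather than nested lists, which is one reason a text-mining toolkit is attractive here.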
We format our issue data into a list array by Repair Type:

Figure 5: Issue Data in List Array Data Structure

Figure 6: Issue Data – expanded

We then convert our list array to a document term matrix, count frequencies, and calculate the individual Replacement Rates.

Figure 7: Final BOM

BOM Building Made Easy

Here we have shown a relatively easy, data-driven approach to developing a BOM. Our goal is to reduce the workload repair personnel spend on BOM maintenance and to create a more proactive approach to BOM management. By also monitoring repair rate trends from period to period, maintainers or senior management can identify missed configuration changes or possible changes to local repair procedures. These techniques can be applied to the manufacturing side as well.

This article was written and developed by the creative team of analysts at CANA Advisors.

Jerome Dixon is a Senior Operations Research Analyst at CANA Advisors jdixon@canallc.com
Aaron Luprek is a Senior Software Developer at CANA Advisors aluprek@canallc.com
Walt DeGrange is the Director of Analytics Capabilities at CANA Advisors wdegrange@canallc.com

#NLP #DoD #BOM #billofmaterial #naturallanguageprocessing #JON #jobordernumber #JeromeDixon #WaltDeGrange #AaronLuprek
- 2018 86th MORS Symposium
CANA at the 86th MORS Symposium

Professional societies are a way to network, share knowledge and techniques, and move the profession forward. CANA Advisors contributed in a big way to this goal during the 86th MORS Symposium in Monterey, CA at the Naval Postgraduate School.

CANA's Norm Reitter, Lucia Darrow, Carol DeZwarte, and Walt DeGrange

CANA's Carol DeZwarte continued to lead interesting and well-attended sessions in Working Group 17: Logistics, Reliability and Maintainability as a co-chair. She will be fleeting up as the chair of the working group next year. This group is considered the home working group for many CANA analysts who have attended and briefed much of their work there over the past years. There were many great briefs covering everything from optimizing inventory to how to develop and deploy complex analytical models.

Two sessions covering how CANA used R to capture inputs and present outputs for a large discrete event simulation attracted huge audiences. CANA's Lucia Darrow did an excellent job discussing the tech behind the implementation and emphasized the importance of deliberate design. She presented "Force Closure Model (FCM): Decision Support Tool Orchestration in R" in Working Group 10: Joint Campaign Analysis and "On-Demand Custom Analytics in R" in Distributed Working Group: Emerging Operations Research.

Lucia Darrow of CANA Advisors presenting at the 86th MORSS

CANA's Walt DeGrange participated as a panel member in a standing-room-only ethics in analytics discussion. The session discussed the special responsibility that each analyst has to represent their unbiased mathematical model to the best of their ability. There were also several ethical dilemmas that were posed by the audience for the panel to discuss.

CANA's Norm Reitter chaired a meeting of the new MORS Logistics Community of Practice. Close to 30 MORS participants attended to discuss the latest issues and possible solutions within the National Security community.
This symposium also saw the departure of Norm and Walt from the Board of Directors. Norm finished up six years on the board serving as MORS President and finishing up as Past President this year. Walt finished out his four year term on the Board of Directors as the Vice President of Professional Development. Both will remain very active in teaching for the MORS Certificate Program (MCP), and Norm will help future board members in his role as an Advisory Director. Overall, a great week of networking, learning, and collaborating with the Military Operations Research community. #86MORSS #CANAAdvisors #NPS #MORS #2018 #MORS #symposium #CarolDeZwarte #WaltDeGrange #NormReitter #LuciaDarrow #86MORSS #NPS #R
- A CANA Congratulations!
It is our pleasure to announce the promotion of Norm Reitter to Chief Analytics Officer/Senior Vice President of Analytics Operations at CANA Advisors! Norm has served successfully as CANA’s Director of Analytics since January 2014. Since 2014, he has dedicated himself to building and managing a diverse, dynamic team of operations research analysts, software developers, statisticians, graphic artists, and subject matter experts who together provide innovative and “usable” solutions to CANA’s commercial and governmental clients. Norm has distinguished himself as a key advisor to CANA during this time, providing insights and input into the company’s strategic growth and market expansion. As he takes on this new executive dual role, Norm will develop and manage CANA’s Information Technology (IT) and Independent Research and Development (IRAD) programs, advise on future analytic investments and offerings, and lead CANA’s rapidly growing Analytics Operations services line. Norm boasts over 25 years of military and commercial experience providing logistics & analytics expertise and solutions. He holds an undergraduate degree from the U.S. Naval Academy and a graduate degree in Operations Research from the Naval Postgraduate School in Monterey, California. He currently serves in leadership roles in several professional analytics organizations. He is the Immediate Past President of the Military Operations Research Society (MORS) and the chair of the Analytics Capability Evaluation (ACE) Sub Committee with the Institute for Operations Research and the Management Sciences (INFORMS). He has three highly accomplished children: Summer (currently working towards a PhD in Psychology at Indiana University of Pennsylvania), Madison (a senior graduating this June and attending Chatham University in the fall pursuing a degree in Sustainability), and Josh (entering his senior year in high school this fall).
When he is not leading all things Analytics at CANA Advisors and raising three amazing young citizens, Norm is snowshoeing, hiking and paddling in the mountains and lakes of Colorado. Please join us in congratulating and welcoming Norm to this new position!

#congratulations #promotion #CANAAdvisors #NormReitter
- CANA Members “2017 Give Back Day”
CANA Advisors, through its CANA Foundation, supports our people and offers opportunities to ‘give back’ in many ways. One specific form of support this past 2017 holiday season was to give our team members company time to spend “volunteering in their local community.” To quote our company’s Founder and President, Rob Cranston, the CANA Foundation “provides the CANA family of employees an opportunity to connect with and give back to community areas we feel passionate and care about.” Below are a few stories of how our team members chose to ‘give back’ using this time.

Bicycles for Monterey

Principal Operations Research Analyst Harrison Schramm used his volunteer time in support of a project with Monterey County Behavioral and Mental Health: procuring and providing bicycles for kids in need. This project started several years ago in a casual conversation between Harrison and the project’s leader. She knew that Harrison was into riding bicycles and wondered if he could help build a few. One thing led to another, and he ended up with a wrench in his hand the week before Christmas 2016. Clinicians in contact with families provide a list with information such as age, height, and gender. An anonymous donor contributes money. Harrison and a few others convert the money into age-appropriate bicycles. The clinicians then pick up the bicycles and deliver them to the families. The process is ‘double blind’ in the sense that the providers and recipients of the bicycles will never be introduced.

Harrison completing the 2017 Bicycle Build: Six bikes and one Scooter.

“That doesn’t stop me from wondering, though,” Harrison said. “Sometimes, I’ll be out on the Rec-trail, see a kid coming and wonder ‘did I build that bike?’” The bicycles are all brand-new, and a helmet is provided with each. “A bicycle isn’t just a toy for a kid on the [Monterey] Peninsula.
It’s exercise, it’s a way to get to school and work, it’s a way to put everything behind you – if only for a few minutes.” Harrison says that he prefers to get the bicycles un-built from stores if he can, because he can fit more in his car that way. Kitsilano Neighborhood House Operations Research Analyst Lucia Darrow spent her volunteer hours at the Kitsilano (“Kits”) Neighborhood House, helping out with the Kits Club after-school childcare program. The Kits House develops programs to meet the needs of the community, ranging from childcare and senior living options to hosting farmers markets and ESL circles for newcomers to the city. Through volunteering with the Kits House and assisting with special events, Lucia says she enjoys connecting with the community and learning about the rich history of Vancouver’s Westside. Lucia on the steps of the Kitsilano Neighborhood House. Samaritan’s Purse Operation Christmas Child Norm Reitter, our Director of Analytics, spent an afternoon at a Samaritan's Purse run "Operation Christmas Child" gift distribution center where he inspected and enhanced gift boxes that were collected from many donation sources. These boxes were then routed through the Denver, Colorado distribution center and shipped to children in need who would not otherwise get Christmas gifts. Operation Christmas Child counts on thousands of volunteers to collect and process millions of shoebox gifts every year. Samaritan's Purse provides this approach so that kids get meaningful and useful Christmas gifts. Norm and Annalisa were busy inspecting donations, adding age appropriate items to gift boxes, and packing the gift boxes into larger containers for shipping. Norm said that seeing all the donations and knowing the positive impact on each child that would receive a gift box made this a very meaningful experience for him and Annalisa. Norm and Annalisa at their local Colorado Operation Christmas Child gift distribution centers. 
In Closing CANA Foundation has enjoyed a wonderful inaugural year of growth and giving back to our communities. We are excited to continue our upward momentum and build upon that success. In 2018, we will continue to create more opportunities for our team to participate, facilitate our team’s ideas to give back, and continue to develop meaningful relationships with other organizations. Onward and upward!! If interested in learning more about the CANA Foundation or in partnering with us, please reach out to Kenny McRostie, our CANA Foundation manager, at kmcrostie@canallc.com. #CANAFoundation #CANAAdvisors #givingback #charity #community #support #bicycles #Kitsilano #KitsHouse #Samaratin #operationChristmasChild
- Using the SEAL Stack
Recently, we needed to develop a desktop application for one of our clients. As web developers, our immediate thought was to use the SEAL Stack (http://sealstack.org). SEAL is a technology stack that uses SQLite, Electron, Angular, and LoopBack.

Why use Electron?
Electron (https://electronjs.org) gives a developer the ability to build cross-platform desktop apps with JavaScript, HTML, and CSS. It is a framework developed by GitHub. It combines Node.js, a JavaScript runtime that allows you to run JavaScript on the desktop, with Chromium, the open-source technology behind Google’s Chrome browser. This allows a developer to build the app as if it were a web app, while from the user's perspective it functions as a single desktop application. Electron is used in many popular applications including Slack, Microsoft Visual Studio Code, and tools from GitHub.

Why use Angular?
Angular (https://angular.io) is a front-end web framework developed by Google. It makes writing single-page apps easy. It uses declarative templates for data binding and handles routing. It promotes component reuse across your application, making your code more stable. Angular uses TypeScript, a superset of JavaScript that adds static typing.

Running a Web Server
An interesting twist to the project was that there was a high probability that the client would want it converted to a web app in the future. Aside from the benefit of being able to develop with familiar web technologies, Electron gave us the ability to easily transition to the web at a later date if needed. With this in mind, we decided from the very beginning to build the app like a standard single-page app and use Electron to run it. Because Electron runs on Node.js, it was easy to spin up a server within the app.
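To make the "server inside the app" idea concrete, here is a minimal sketch. It assumes Node's built-in http module rather than the LoopBack server the article goes on to describe, and the names `ApiResult`, `route`, and `startLocalServer` are hypothetical, chosen only for illustration.

```typescript
import * as http from "http";

// Hypothetical response shape for this sketch; not part of Electron or LoopBack.
interface ApiResult {
  status: number;
  body: string;
}

// Pure routing logic, kept separate from the transport so it can be tested
// on its own and reused unchanged on a hosted web server later.
function route(method: string, url: string): ApiResult {
  if (method === "GET" && url === "/api/ping") {
    return { status: 200, body: JSON.stringify({ ok: true }) };
  }
  return { status: 404, body: JSON.stringify({ error: "not found" }) };
}

// Start an HTTP server bound to the loopback interface only, so that just the
// local desktop app can reach it. Port 0 asks the OS for any free port.
function startLocalServer(onReady: (port: number) => void): http.Server {
  const server = http.createServer((req, res) => {
    const result = route(req.method || "GET", req.url || "/");
    res.writeHead(result.status, { "Content-Type": "application/json" });
    res.end(result.body);
  });
  server.listen(0, "127.0.0.1", () => {
    const address = server.address();
    // address() can be a string (pipes) or null; only the object form has a port.
    onReady(address !== null && typeof address === "object" ? address.port : 0);
  });
  return server;
}
```

In an Electron main process, one could call `startLocalServer` and, once the port is known, point a `BrowserWindow` at `http://127.0.0.1:<port>` with `loadURL`. Because `route` knows nothing about Electron, the same logic could later sit behind an ordinary web server.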
In the future, if we need to transition the app to the web, it will simply require deploying the code to a web server (and a few additional tasks such as changing data connectors to connect to a database server, adding authentication, etc.).

Why use LoopBack?
For the web framework, we chose LoopBack (https://loopback.io). LoopBack is a highly extensible, open-source Node.js framework. It is built on Express, the most popular Node.js framework. It makes it easy to quickly create dynamic end-to-end REST APIs. It has an ORM and data connectors for all the standard databases, making it very easy to retrieve and persist data.

Why use SQLite?
By default, the LoopBack boilerplate configuration uses memory for data storage. Because we needed the data to persist between sessions, we decided to use a database for data storage. In this case, we chose SQLite (https://sqlite.org). Benefits of SQLite include not having to install a database server on the user’s computer. SQLite is public domain and works across many different platforms. The data is stored in a single .sqlite file that can be transferred from one computer to another if needed, which could help with syncing data between users in the future. To avoid any issues running SQLite cross-platform, we used sqljs (sql.js), a version of SQLite compiled to JavaScript, and wrote a custom connector for LoopBack using it (https://github.com/canallc/loopback-connector-sqljs).

System Architecture
Here’s a diagram illustrating how the four elements of the SEAL stack integrate together.

Wiring it Up
The easiest way to get started with the SEAL stack is to use the quick-start project (http://sealstack.org). The site is well documented. It also provides instructions for modifying an existing application to use the SEAL stack.

This article was a collaboration between CANA Advisors Principal Software Developer Dan Sterrett and CANA Advisors Senior Software Developer Aaron Luprek.
For more programming articles, information on SEALstack, and other projects in development, visit CANAadvisors.com #SEALStack #stack #SQLite #Electron #Angular #Loopback #framework #developer #desktopapp #JavaScript #TypeScript #WebServer #AaronLuprek
- How Learning French Refreshed My Analytical Strategy
A few months after graduating with an advanced engineering degree, I find myself back in the classroom, this time for my first class of beginner French. All about me I hear snippets of broken French from my Canadian classmates: phrases, simple sentences and questions. I know three words, which I can pronounce in a distinctly American way: bonjour, merci, and croissant. The “beginner” level of French language for Canadians, it turns out, is a little different from the “beginner” level for an American. I reassure myself that I’m a fast learner and struggle through the first class.

After years of focus in one area of work, it’s natural to grow confident in your carefully crafted method of learning and doing. Once-varied problems start to take on familiar forms, and it becomes easier to prescribe a certain solution. Stepping into French, I realized my tried-and-true approaches to learning were not going to prove effective. Several months later, here are some lessons I learned.

Failing: Fast and often.
I find the most difficult part of language acquisition is not grammar or syntax, but the inevitability of mistakes. Regarding mistakes as taboo creates a major roadblock to personal improvement. The same holds true with solving a difficult analytics problem. Instead, sharing in-progress or flawed work with colleagues helps to break through the small failures and clear a path to a robust solution.

Out with the old and in with the new – Suppressing instinct and embracing a new technique.
As with many language learners, my first instinct when I don’t know a word is to simply throw in the word from another language. Similarly, we tend to retain old sentence structures until the structures of the new language become natural. R users can understand how this relates to learning the dplyr workflow or transitioning to functional programming. While these changes feel like a major paradigm shift at first, the impact on future work can prove invaluable.

Analytics MacGyver.
Asking someone about their aunt’s profession can sound something more like “What does your mother’s sister do in life?” coming from a novice speaker. This roundabout method may sound silly, but is arguably better for the learning process than simply inserting words in English. Analytics professionals must also be bricoleurs, utilizing many resources, tools, and experts to make complex and unfamiliar problems tractable.

Diving in and staying in.
Immersion and persistence are key to language acquisition. In analytics, methods are rapidly changing and improving. Attempting to become proficient in every new technology can be tempting, but dedicating time to one technology allows for quicker mastery.

Abstraction and derivation of meaning.
In the early stages of learning, every interaction with a new language can feel like a game of abstraction, as we try to translate back to our mother tongue. As sentences become phrases, then complex sentence structures, the problem becomes a greater puzzle. Here is where I’d argue that many analytics professionals would find joy in the challenge of language acquisition - the feeling of successfully working through a verbal puzzle and constructing a response, hopefully more expressive than oui or non.

Lucia is an Operations Research Analyst at CANA Advisors. To find more content on learning and leveraging analytics, continue to visit our CANA Connection. #learningFrench #strategy #Analytics #LuciaDarrow #R
- Fake News: A Problem for Data Science?
Over the past year, "fake news" has become a topic of particular interest for politicians, news media, social media companies, and... data scientists. As this type of news clutter becomes more prevalent, individuals and organizations are working to leverage computing power to help social media users discern the "fake" from the legitimate. In this article, we take a look at some basic natural language processing (NLP) ideas to better understand how algorithms can help make this distinction.

Natural Language Processing: A Brief Introduction
Text Preprocessing: Arguably the most important step in text mining is preparing the data for analysis. In NLP, this involves actions such as tokenizing words, removing distinctions between upper- and lower-case words, stemming (reducing words to their roots), and removing stop words (common words in a language that carry little meaning on their own; think: the, and, is). An example of tokenization and stemming is shown below in Figure 1.
Bag of Words: This model is useful in finding topics in text by focusing on word frequency. Bag of words can be supplemented with word vectors, which add meaning to NLP representations by capturing the relationship between words.
Text as a Graph: Graph-based approaches consider words as nodes and focus on associations to draw more complex and contextually rich meaning from text data.
Named Entity Recognition (NER): This method can be used to extract types of words, such as names, organizations, etc. Many NER libraries are publicly available online.
Sentiment Analysis: Otherwise known as "opinion mining," this technique provides a gauge of the author's feeling towards a subject and the strength of that feeling. Do fake news outlets produce more opinionated articles?
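The preprocessing and bag-of-words steps described above can be sketched in a few lines. This is only an illustration under stated assumptions: the stop-word list is a tiny made-up sample, the function names are hypothetical, and stemming is left out entirely, so it does less than the R example in Figure 1.

```typescript
// A small, illustrative stop-word list; real lists are much longer.
const STOP_WORDS: string[] = ["the", "and", "is", "in", "a", "of", "to"];

// Lower-case the text and split it into word tokens on any run of
// non-letter characters (so "don't" becomes "don" and "t").
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z]+/)
    .filter((token) => token.length > 0);
}

// Drop common words that carry little meaning on their own.
function removeStopWords(tokens: string[]): string[] {
  return tokens.filter((token) => STOP_WORDS.indexOf(token) === -1);
}

// Count how often each remaining token occurs: the "bag of words".
function bagOfWords(text: string): Record<string, number> {
  // A prototype-free object, so tokens like "constructor" count correctly.
  const counts: Record<string, number> = Object.create(null);
  removeStopWords(tokenize(text)).forEach((token) => {
    counts[token] = (counts[token] || 0) + 1;
  });
  return counts;
}
```

For example, `bagOfWords("Fake news is news")` yields a count of 1 for "fake" and 2 for "news", with the stop word "is" dropped. Word order is discarded on purpose, which is what makes the representation simple and also what limits it.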
    # Tokenization and Stemming Example
    library(tokenizers)  # provides tokenize_word_stems()
    headline <- "The Onion Reports: Harry Potter Books Spark Rise in Satanism Among Children"
    tokenize_word_stems(headline)
    ## [[1]]
    ## [1] "the"   "onion" "report" "harri" "potter" "book"
    ## [7] "spark" "rise"  "in"     "satan" "among"  "children"

Figure 1. Tokenization and Stemming Example

How Are Data Scientists Framing the Problem?
While popular browser extensions use crowdsourcing to classify sites that publish fabrications, researchers are reframing the problem of fake news. In order to fit a model, an understanding of the most influential features that differ between fake and legitimate news is helpful. Regardless of whether the fake news is created by provocateurs, bots, or satirists, we know it will have a few things in common: a questionable source, content out of line with legitimate news, and an inflammatory nature. Current research in the area takes advantage of these truths and applies approaches ranging from naive Bayes classifiers to random forest models. Researchers at Stanford are investigating the importance of stance, a potential red-flag trait of misleading articles. Stance detection assesses the degree of agreement between two texts, in this case the headline and the article. Another popular approach is the use of fact-checking pipelines to compare an article's content to known truths or an online search of a subject.

As the complexity of fake news adapts to modern modes of media consumption, research in this space will expand. Image classification is a likely next step, albeit one that poses a major scalability challenge.

Interested in learning more or building your own fake news classifier? Check out these resources: Python's Natural Language Toolkit (NLTK); R's NLP package; and Python's spaCy for NER.

Our analysts at CANA Advisors are always interested in hearing from you. If you have an interesting “data” dilemma, contact Lucia Darrow.
[EMAIL] #fakenews #science #NER #NLP #NaturalLanguageProcessing #tokenization #stemming #datascience #LuciaDarrow