Updated: Oct 16, 2020
2020 has turned the corner, and a new year fast approaches. For some, leaves are turning; for some, schools are back in session; and for everyone, each day brings a variation of change, growth, uncertainty, and excitement. Then again, who would have imagined the “new normal” would be, simply, normal? Here at CANA, we continue to work in the way we always have: expertly managing, interpreting, and analyzing data to answer questions and solve problems. CANA’s commitment to excellence and innovation remains constant. In this month’s newsletter, we’ve highlighted some of this work in a profile of Jerome Dixon, a Senior Operations Research Analyst at CANA, and his ongoing efforts in healthcare analytics, along with a wrap-up of our virtual two-day Veterans’ Analytics Course, an event that culminated in an interactive panel discussion with a diverse group of leaders in the field of data analytics. We’re sending them a huge thank you from Team CANA for sharing their helpful insights!
We hope this month and next continue to find you and yours well. Take in the traditions: experience the fun of Halloween, participate in our nation’s democratic processes in early November, and look forward to friends and family at Thanksgiving. The more things change, the more, in fact, they stay the same. Happy October!
~ Team CANA
DEEP LEARNING IN THE CANA CAR: A CANCER TRIAL DATA SCIENCE CHALLENGE
A look at the CANA Analytics Roundtable
By Jerome Dixon | Senior Operations Research Analyst
and Cherish Joostberns | Resource Coordinator
As a Senior Operations Research Analyst at CANA Advisors, I’ve had the opportunity to apply both my military logistics experience and healthcare analytics expertise to a variety of challenging problems. I recently shared a learning experience with peers at our monthly CANA Analytics Roundtable (CAR). The CANA “CAR” is a monthly gathering open to all hands of the company, which highlights employee-chosen topics related to analytics techniques and technologies. Each monthly CAR typically hosts four to five short presentations. During our most recent roundtable, I talked about my strategy and initial execution of a cancer trial data science challenge hosted by Oak Ridge National Laboratory. The challenge was to analyze cancer patient information and data sets to determine an individual’s appropriate assignment to a select clinical trial. I identified 25 input features within the patients’ information database that would eventually feed into the target variable: “Selected for Study, Yes or No.” I noted a specific need for improvement based on doctors’ feedback: trial names did not always match the relevant disease site, so a critical linkage point between patients and potential trial participation was being missed.
I used PyTextRank as one means to address the data challenge. Although the “bag of words” is a common model in text mining, it focuses mostly on simple word identification and counting. In this instance, I felt Python’s PyTextRank library was the right tool, given the types of trials and abstracts in the challenge, to classify titles to the correct cancer sites. PyTextRank can be used not only for identification and counting but also to select keywords, assign importance to each word, and build summary sentences from text. I used PyTextRank to review the summaries and to establish an eigenvector centrality metric that ranked node importance based on not only the number of connections but also the quality of the connected nodes, essentially creating a network strength metric.
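To make the contrast with “bag of words” concrete, here is a minimal sketch in plain Python. It is not PyTextRank and not the code used in the challenge; it only illustrates the idea of the centrality metric described above (commonly called eigenvector centrality) on a word co-occurrence graph. The sample title, the window size, and all function names are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def bag_of_words(tokens):
    """Baseline view: count how often each word appears."""
    return Counter(tokens)


def cooccurrence_graph(tokens, window=2):
    """Link each word to the words that follow it within `window` positions."""
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def eigenvector_centrality(graph, iterations=200, tol=1e-9):
    """Power iteration: a word scores highly when its neighbors score highly.

    Each update adds the node's own score to the neighbor sum (iterating on
    A + I instead of A). This keeps the same ranking vector but avoids
    oscillation on some graph shapes.
    """
    scores = {node: 1.0 for node in graph}
    for _ in range(iterations):
        updated = {
            node: scores[node] + sum(scores[nbr] for nbr in graph[node])
            for node in graph
        }
        top = max(updated.values())
        updated = {node: value / top for node, value in updated.items()}
        change = max(abs(updated[node] - scores[node]) for node in graph)
        scores = updated
        if change < tol:
            break
    return scores


# Hypothetical trial title, already lowercased and stripped of stop words.
tokens = "breast cancer trial metastatic breast cancer patients".split()

counts = bag_of_words(tokens)
scores = eigenvector_centrality(cooccurrence_graph(tokens))

for word in sorted(scores, key=scores.get, reverse=True):
    print(f"{word:12s} count={counts[word]}  centrality={scores[word]:.3f}")
```

In this toy title, “trial” and “patients” each appear once, so a pure count cannot separate them, while the graph score ranks “trial” higher because it connects to more well-connected words.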
Another critical tool was Keras, the deep learning API, which I used through its R interface. Although it appears difficult to work with, most of the heavy lifting turned out to be in data preprocessing, putting the data into matrix and vector format. Keras was critical in addressing my intent to define and train a model that matches a large body of cancer trials to specific cancer anatomical sites (e.g., brain, breast, prostate), thereby enabling an effective match between patients and potentially useful trials. To put these different elements together, I used the reticulate package to call Python’s PyTextRank from R. This approach was fairly effective, and I was able to demonstrate initial iterations of my model.
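As a rough illustration of the preprocessing step mentioned above, the sketch below turns trial titles into a count matrix and turns cancer-site labels into one-hot vectors, which is the general shape of input a neural network classifier expects. It is written in plain Python rather than R, it is not the code used in the challenge, and the titles, site labels, and function names are all hypothetical.

```python
def build_vocabulary(titles):
    """Map every distinct lowercase word to a column index."""
    words = sorted({word for title in titles for word in title.lower().split()})
    return {word: index for index, word in enumerate(words)}


def vectorize(titles, vocabulary):
    """Turn each title into a row of word counts (one column per word)."""
    matrix = []
    for title in titles:
        row = [0] * len(vocabulary)
        for word in title.lower().split():
            if word in vocabulary:
                row[vocabulary[word]] += 1
        matrix.append(row)
    return matrix


def one_hot(labels, classes):
    """Encode each label as a 0/1 vector with a single 1 at its class index."""
    position = {name: index for index, name in enumerate(classes)}
    return [
        [1 if position[label] == column else 0 for column in range(len(classes))]
        for label in labels
    ]


# Hypothetical training examples: trial titles and their anatomical sites.
titles = [
    "Breast cancer radiation study",
    "Prostate cancer hormone therapy trial",
]
sites = ["breast", "prostate"]
classes = ["brain", "breast", "prostate"]

vocabulary = build_vocabulary(titles)
features = vectorize(titles, vocabulary)   # one row per title
targets = one_hot(sites, classes)          # one row per title

print(len(features), "rows x", len(vocabulary), "columns")
print(targets)
```

Matrices like `features` and `targets` are what would then be handed to a model for training; the model definition itself is left out here because the source does not describe its architecture.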
As I continued through validation and analysis of my approach, I realized the classification model did not produce what I considered significant results. I need to do further feature engineering on the text corpus dataset and improve the model's input features. This iterative process will help determine the features that best represent, classify, and connect the data flowing into the model to provide optimal results. My next steps are to experiment with and test the methods described in the tutorial at https://cloud4scieng.org/2020/08/28/deep-learning-on-graphs-a-tutorial/.
This deep learning approach may reveal more about the underlying structure of the cancer study data; define the nodes and edges that detail its connections and features; identify or predict links and communities; and enable classification between classes. I intend to, quite literally, connect the dots of the data to solve this cancer clinical trial challenge.
TEAM CANA MEMBER SHOWCASE
Jerome Dixon | Senior Operations Research Analyst
“There is no such thing as was - only is. If was existed, there would be no grief or sorrow.”
Jerome is a proven leader with 20 years of military experience. He uses a systems thinking approach that draws on today’s most advanced data science tools for solving problems. His areas of expertise include supply chain, logistics, aviation maintenance, information technology, and analytics. He is currently working in the healthcare space with the Virginia Commonwealth University (VCU) Health System and the Commonwealth of Virginia’s Department of Behavioral Health and Developmental Services (DBHDS). Jerome’s work is focused on hospital informatics, hospital information systems, and precision medicine with genetic datasets. He is also active in the Institute for Operations Research and the Management Sciences (INFORMS) and the System Dynamics Society (SDS).
Previously, Jerome developed a series of forecasting models for the U.S. Marine Corps’ Bill of Material (BOM) and maintenance repair support. These models incorporated machine learning, natural language processing, forecasting, and data visualization best practices. He performed a literature review and research into additional machine learning models and algorithms applicable to the U.S. Marine Corps’ supply and maintenance business practices. His preferred tools were R, Python, SAS Enterprise Miner, Apache Spark, Amazon Web Services, Databricks, and Microsoft Power BI.
Jerome’s primary career focus across 20 years of analytics and process improvement experience has been the alignment of people, process, and technology. He believes culture and analytics are critical to achieving that alignment for maximum organizational effectiveness. Jerome has held several U.S. Navy operational assignments, a U.S. Navy IT product development assignment, and a Defense Logistics Agency (DLA) Aviation program management tour. He is a former F/A-18 Weapon Systems Support Officer. Jerome has wide and deep domain knowledge in the logistics space that he is leveraging and transitioning to the healthcare space, and more specifically, to personalized medicine. Jerome believes his Navy background and training have set him up well in this arena.
“Personalized medicine is a very challenging concept where you are aligning the most recent research and healthcare techniques to a specific patient with information that can be acted on to reduce hospital costs and improve patient outcomes. Significant process improvement, information technology, analytical frameworks, disparate data, and cultural challenges all exist.”
Not only is Jerome a finalist in the Virginia Beach Biotech (VaBeachBio) Innovation Challenge, he also recently participated in the Oak Ridge National Laboratory’s (ORNL) Data Science Challenge. This challenge can be found here in more detail: https://github.com/Jerome3590/Using-Artificial-Intelligence-Techniques-to-Match-Patients-with-Their-Best-Clinical-Trial-Options
Fun Fact: Jerome’s favorite album is the ‘Lucero Live From Atlanta’ album!
You can follow Jerome’s GitHub account for project updates or contact him at email@example.com!
VETERANS' ANALYTICS EVENTS
CANA Advisors Learning and Development
By Lucia Darrow | Senior Operations Research Analyst
In September, the CANA Foundation partnered with CANA’s Learning and Development program to create an analytics course tailored to the veteran perspective. As a veteran-owned company, we see the immense value that veterans bring to our team and projects every day. This course gave our team a chance to give back, make connections, and share their unique perspectives on the skills needed to succeed in government and commercial analytics work.
Several CANA team members, including Walt DeGrange, Jason Fincher, Connor McLemore, Rocky Graciani, and Kim Mamula, volunteered their time to the development and delivery of the course. The course focused on concrete career advancement through analytics and coding skill development, portfolio development, and connection to the analytics community. Students were encouraged to uncover their “analytics superpower,” the crucial intersection of their interests, coding skills, and mathematical knowledge. The course concluded with a community networking event, the Veterans in Analytics Panel Discussion. CANA was joined by Joshua Wilson (America’s Warrior Partnership), Randi VanNyhuis (The Walt Disney Company), John Alexander Harris (Boxelder Analytics), Daniel W. Hudson (ReefPoint Group, LLC), and Jerome Dixon (CANA) for an exciting and honest conversation about breaking into a career in analytics and data science.
“I attended the CANA Advisors Veterans’ Analytics Course in September 2020. This free course was taught entirely by veterans on the CANA Advisors staff. Course topics ran the gamut from portfolio development to prescriptive analytics and modeling. The course content was pertinent, insightful, and well delivered by industry professionals. I sincerely appreciated the time and effort shown by the class instructors to give back to the transitioning veteran community. I especially enjoyed the panel discussion held afterwards, where several of the advisors engaged in a more free-form talk with a Q&A session. I gained invaluable contacts and advice from the advisors and my fellow attendees. I believe all the attendees were truly thankful for the efforts of the CANA advisors, especially given the stress of the current environment. It was a highly successful event from my perspective, and well worth repeating.”
- Al Bellamy, an attendee of the Veterans’ Analytics Course.
If you have any questions about the CANA Foundation, its initiatives, and partnerships, please reach out to Kenny McRostie, CANA Foundation Manager, at firstname.lastname@example.org or visit our website at http://www.canallc.com/giving-back. If you would like to learn more about CANA’s Learning and Development offerings, please reach out to Lucia Darrow, Senior Operations Research Analyst, at email@example.com or visit http://www.canallc.com/learn.
Interested in upcoming analytics training or webinars? The Learning and Development program at CANA Advisors would like to hear from you. Please take this survey to share your thoughts!