Data Wrangling and Visualization with Tidyverse in R | R Programming
Data Wrangling and Visualization Program Overview
Why Learn Data Wrangling and Visualization using Modern R - Tidyverse?
According to a New York Times article by Steve Lohr (2014), data scientists spend 50% to 80% of their time on data wrangling (i.e., data cleaning and transformation) processes and 20%-50% of their time on data modeling, implying the importance of skills needed for the data wrangling task.
However, most degree programs focus on data modeling, presumably because it is the most technically challenging part and is seen as worthy of a degree. Most data science programs do not offer a systematic course in data wrangling and visualization, yet they expect students to use these skills in conjunction with modeling, so students face two challenges simultaneously. The same is true in most statistics classes, where students have to learn not only statistics but also the software. To make things worse, statistics classes often use Base R, which is not easy to learn and can dampen the enthusiasm students started with.
Learning how to code can be and should be much easier. In fact, modern data science with R relies heavily on the Tidyverse collection of packages, which is easier to learn and fun to use. For data visualization, for instance, modern R offers ggplot2, one of the most popular data visualization packages. Using ggplot2, you can customize your chart creatively and persuasively in almost any way you want to support the strategic storytelling in your presentation.
Why This Certificate Program?
This certification is designed to help both novices and intermediate users of Base R by giving them the necessary knowledge in the Tidyverse way of coding for data wrangling and visualization so that they can focus more on statistics or machine learning topics when they take those classes. Also, this course aims to give data science aspirants adequate knowledge and skills for the most popular tasks for data analysts and data scientists – Data Wrangling and Visualization to help them get started with their careers.
The course will follow the recommendation of Hadley Wickham, who is the chief architect of the popular Tidyverse mega package you will learn throughout the course. We will start with the most important aspect of Tidyverse, data visualization! After you master the topic, we will teach you the nuts and bolts of how to manipulate and transform all sorts of data to support the kind of visualization you want to create for your storytelling. We will not stop at wrangling: each time you learn how to wrangle data, you will be guided to produce a chart as the final step. This way, you will keep applying your visualization skills across all 13 modules.
The certification includes 12 modules and a capstone project, which are to be completed in about 13 weeks. The course comprehensively covers data wrangling and visualization using the Tidyverse framework (modern data science in R) so that you can prepare presentations of complex data for a non-technical audience, with a focus on strategic data visualization. Please see the Course Outline below for the content of each module.
The price is set to be competitive with online courses available elsewhere, giving you unparalleled value for the quality of education you will get. We invite you to take the time to learn about how we can help you on your journey. Contact us at Insights-Lab@cpp.edu if you have any questions.
Skills Covered
Strategic data visualization, literate coding, reproducible research, wrangling for all types of data, making interactive or animated charts and dashboards, modern wrangling and visualization skills using Tidyverse in R, importing and exporting data from and to the web, cloud, and local drive, joining multiple relational datasets, dealing with labeled data (e.g., SAS, SPSS, Stata), functional coding, creating reports in HTML, PDF, and PPT, automation and simplification of repetitive tasks, etc.
Tools Covered
R, RStudio, R Markdown, dplyr, tidyr, stringr, forcats, lubridate, readr, purrr, ggplot2, magrittr, Tidyverse, plotly, ggrepel, ggthemes, GGally, gganimate, GitHub, web scraping with SelectorGadget, haven, broom, patchwork, naniar, here, scales, labelled, sjlabelled, sjPlot, kableExtra, glue, etc.
Prerequisite
None. No prior coding experience is needed.
Expected Outcome
Participants will receive a Certificate of Completion upon successful completion of the program. Your letter grade will appear on a professional program transcript, but the course will not count towards an academic credit-bearing degree program. You can list the certificate on your resume. You will also come away with over 20 codebooks and a comprehensive capstone project, which will aid you in future endeavors. Upon completing the course, students will be able to:
- Explain the role of data visualization as part of persuasive presentation and storytelling.
- Choose the most effective visualization method for the given data and audience characteristics.
- Wrangle data to support strategic visualization objectives.
- Generate insights and recommendations and present them effectively and persuasively.
- Practice reproducible data visualizations to promote transparency, credibility, and ethics.
Course Outline
Course Overview
There are a total of 12 modules plus a capstone project in this course. In Modules 1 and 2, you will learn about data science in R and its ecosystem for modern data science that utilizes the concept of tidy data manifested through the Tidyverse mega package. You will learn about R’s capability and compare it with Python. In Modules 3 and 4, you will learn how to visualize the data strategically to support the kinds of storytelling for your audience, which is the final goal of this course. By knowing the stories you want to tell, you will set your visualization goal from the very beginning.
The rest of the modules are devoted to teaching you the tools you need to shape the data to support data visualization. That is, you will learn how to wrangle the data to support the visualization in Modules 5 and 6 (how to tidy and transform data), Modules 7 and 8 (how to deal with various data types such as strings, factors, dates, and times), and Modules 9 and 10 (how to import and export various forms of data). Modules 11 and 12 (programming) are not part of the wrangling process, but adding programming skills will help you save a lot of time by automating repetitive coding tasks. You will have learned most of the concepts and skills needed for the capstone project by the time you finish Module 12.
The capstone project will be introduced right after Module 4 so that you can start tackling its problems as you learn each important concept. See which questions you can answer after you finish Module 4. Revisit the capstone project after Modules 5 and 6, again after Modules 7 and 8, and so on. In the process, you will keep modifying and improving your code each time you revisit it. By the end of Module 12, your capstone project should be ready for submission.
Module 1
In this module, you will learn how to install R and RStudio. You will also learn how to make the best use of R Markdown within RStudio. RStudio is an IDE (Integrated Development Environment) that makes learning R much easier. In RStudio, you can run not only R code but also Python code and code in other programming languages. With this preparation done, you will learn what you can do with R. While the program will teach you some Base R, it will focus on the Tidyverse approach to data science.
Upon successful completion of this module, you will be able to:
- Install R and RStudio.
- Describe the layout and menus of RStudio.
- Start an R script file, run code in it, and save it.
- Install and load R packages.
- Start an R Markdown file, run and organize code in it, and save it.
Module 2
The goal of this module is to introduce an overview of the world of data science in R and get you ready for the rest of the modules.
Specifically, you will explore the universe of R and data science in general in this module. As R spreads through the academic and research community, more and more college students are learning statistics with R. R was initially developed as a statistical tool. What else can we do with R? What do data scientists do with R?
For starters, we can create a chart any way we want with ggplot2. You can create and update Word, HTML, PPT, and PDF files right from an R Markdown file. You can animate your charts and make them interactive. You can create a website and a dashboard with shiny and shinydashboard. You can build machine learning models with caret or tidymodels. You can run Python in R with the reticulate package. Some packages help you create charts with a menu-driven approach (e.g., ggThemeAssist and esquisse).
In this module, you will learn R's capabilities and resources that you can use to learn the skills. There is no way you can learn everything in this short course. Thus, we will focus on some fundamental topics, while you will be led to resources for advanced topics you can tackle in the future.
Upon successful completion of this module, you will be able to:
- Describe the concept of the Tidyverse way of coding.
- List the pros and cons of R and Python for data science.
- Describe the capability of the Tidyverse package in R.
- Explain how to use online resources provided by the R community.
Module 3
ggplot2 is a plotting package that provides helpful commands to create complex plots from data. It provides a programmatic interface for specifying which variables to plot, how they are displayed, and their general visual properties. As a result, only minimal changes are needed if the underlying data changes or if we decide to switch from a bar plot to a scatterplot, which helps in creating publication-quality plots with little adjustment and tweaking. It is not surprising that ggplot2 is included in Tidyverse.
In this module, you will be introduced to the grammar of graphics as implemented in the ggplot2 package. Every ggplot starts with defining the data and the aesthetic mapping of variables, which is the foundation. Then, you add a geometry layer over the foundation. These two are the basics of any ggplot; the rest of the layers are optional. You will also learn how to create charts when your data set has one or more continuous or categorical variables. Furthermore, you will learn how to plot charts after you select and filter the variables you want to plot in your data set.
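As a rough illustration of the layering idea described above (not taken from the course materials), the sketch below builds a chart in three steps: the foundation, one geometry layer, and some optional layers. It assumes the mpg example data set that ships with ggplot2; the axis labels and theme are arbitrary choices.

```r
library(ggplot2)

# Foundation: the data and the aesthetic mapping of variables.
# mpg is an example data set bundled with ggplot2 (an assumption of this sketch).
p <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy))

# Geometry layer: draw one point per observation.
p <- p + geom_point()

# Optional layers: axis labels and a theme.
p <- p +
  labs(x = "Engine displacement (L)", y = "Highway miles per gallon") +
  theme_minimal()

print(p)
```

Swapping geom_point() for another geometry changes the chart type while the foundation stays the same.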
Upon successful completion of this module, you will be able to:
- Explain the concept of the grammar of graphics when visualizing data with the ggplot2 package.
- Be familiar with various types of charts.
- Visualize counts, proportions, and geospatial data.
- Select appropriate charts based on strategic considerations (e.g., the characteristics of the data and audience).
- Create a chart that involves one or two variables.
- Create a chart by adding a categorical moderator (3rd variable) to the chart involving two variables.
- Create correlation charts.
Module 4
In this module, you will be introduced to advanced topics of the grammar of graphics and ggplot2 extensions such as patchwork and gganimate. You will also observe how data scientists use ggplot2 to customize charts for better communication, including live screencast demonstrations of clever and creative data visualizations. Advanced users or motivated beginners are encouraged to tackle the advanced visualization codebook with interactive, animated, or geospatial charts.
Upon successful completion of this module, you will be able to:
- Customize correlation charts.
- Create a chart that involves four or five variables by adding layers of geometry.
- Modify charts for storytelling by customizing axes, labels, coordinates, themes, etc.
- Explain when to use interactive or animated charts or dashboards.
- Visualize geospatial data on a map.
- Read charts and generate insights.
Module 5
Understanding the concept of Data Wrangling is important because many professionals inside technology industries have to face different data types and also deal with the various sources of data. In this module, you will be introduced to Data Wrangling. In a broad sense, data wrangling includes (1) importing data, (2) tidying data, and (3) transforming data. After the wrangling process, you can proceed with visualizing data and modeling.
In this module, we will focus on the last two stages of the wrangling process: tidying and transforming data. You will also be introduced to a modern data structure called a tibble and learn how tibbles differ from data frames. One of the first steps in data wrangling in the Tidyverse ecosystem is to reshape the data to be tidy using the "tidyr" package. You will learn the concept of tidy data and how reshaping the data helps you visualize it. Once your data is tidy, you will want to transform it with the help of the "dplyr" package, and you will learn some key dplyr functions for doing so. In base R, one often had to nest function calls inside one another, with parentheses inside yet more parentheses, which makes the code hard to read and understand. In modern R, you can zoom in on the data using the pipe operator (%>% or |>), which lets the data flow from top to bottom through a pipe. That is, you start with the data, then pipe it into functions that tidy or transform it. Once wrangling is done, you pipe the wrangled data into ggplot() to do all the necessary mapping and geometry as well as other optional layers of coordinates and themes, according to the grammar of graphics.
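The workflow described above can be sketched roughly as follows. The sales_wide table and its column names are made up for illustration; %>% is used here because dplyr provides it, and the native |> pipe works the same way in R 4.1 or later.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Nested base-R style reads from the inside out ...
nested <- head(sort(sqrt(c(16, 4, 9))), 2)
# ... while the piped version reads from left to right (or top to bottom).
piped <- c(16, 4, 9) %>% sqrt() %>% sort() %>% head(2)

# Hypothetical wide data: one row per store, one column per quarter.
sales_wide <- tibble(
  store = c("A", "B"),
  q1 = c(100, 80),
  q2 = c(120, 90)
)

# Tidy (one row per store and quarter), then transform (keep larger sales).
sales_tidy <- sales_wide %>%
  pivot_longer(cols = c(q1, q2), names_to = "quarter", values_to = "sales") %>%
  filter(sales >= 90)

# Pipe the wrangled data into ggplot() and add a geometry layer.
chart <- sales_tidy %>%
  ggplot(aes(x = quarter, y = sales, fill = store)) +
  geom_col(position = "dodge")

print(chart)
```

Note that layers inside a ggplot are still added with +, not with the pipe.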
Upon successful completion of this module, you will be able to:
- Describe the concept of Data Wrangling.
- Describe how Tibbles are different from data frames.
- Explain how to convert wide or long data to "Tidy" data.
- Explain how to merge relational data sets using joins.
- Be familiar with key dplyr verbs and use them to transform data.
- Use the pipe operator to shape the data to prepare for analysis and visualization.
Module 6
Continuing from Module 5, you will expand your horizons by going deeper into the topics with various approaches.
First, you will learn how data scientists wrangle and visualize data for their projects by watching demonstrations. Next, you will be exposed to more advanced topics and associated functions such as recode(), across(), case_when(), rownames_to_column(), distinct(), rowwise(), and c_across().
Upon successful completion of this module, you will be able to:
- Describe the concept of Data Wrangling.
- Describe how Tibbles are different from data frames.
- Explain how to convert wide or long data to "Tidy" data.
- Explain how to merge relational data sets using joins.
- Be familiar with key dplyr verbs and use them to transform data.
- Use the pipe operator to shape the data to prepare for analysis and visualization.
Module 7
Data types (numeric, character, logical, factor, dates) are the building blocks of data structures (vectors, matrices, data frames, and lists). Most of the time, you are likely to work with data frames or tibbles in the Tidyverse framework, which resemble a spreadsheet with rows of observations and columns of features or variables. Matrices are similar to data frames in that they consist of rows and columns. A major difference is that all values in a matrix must be of the same data type, while the columns of a data frame can be of different types. You can perform various operations on data to filter and view parts of it or to create new variables. Knowing the differences between the data types and structures will give you the basic knowledge needed to manipulate data later.
In addition to Base R functions, Tidyverse has many packages that make it easier to work with different types of vectors than base R does. With the stringr package, you can wrangle strings (or characters) and regular expressions. With forcats, you can wrangle factors, which represent categorical data with a fixed set of levels. With the lubridate package, you can wrangle dates and times.
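A minimal sketch of what each of the three packages is typically used for follows. The example values are invented for illustration and do not come from the course materials.

```r
library(stringr)
library(forcats)
library(lubridate)

# stringr: trim whitespace, normalize case, and match a regular expression.
raw_names <- c("  alice SMITH ", "bob jones")
clean_names <- str_to_title(str_trim(raw_names))
has_smith <- str_detect(clean_names, "Smith$")

# forcats: put factor levels in a meaningful order instead of alphabetical.
sizes <- factor(c("small", "large", "medium", "small"))
sizes <- fct_relevel(sizes, "small", "medium", "large")

# lubridate: parse a date, extract a component, and do date arithmetic.
start_date <- ymd("2024-01-15")
start_month <- month(start_date)
later_date <- start_date + days(30)
```

Ordering factor levels this way matters for charts, because ggplot2 uses the level order when arranging categories on an axis or in a legend.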
Upon successful completion of this module, you will be able to:
- Explain atomic R data types - numeric, character, logical, factors, and dates.
- Explain the differences between data structures -- vectors, matrices, data frames, and lists.
- Understand when to use each data type.
- Create data frames.
- Use popular functions from base R and Tidyverse to view and manipulate data frames that contain the various data types.
- Use popular functions from the stringr package to handle strings and regular expressions.
- Use popular functions from the forcats package to manipulate factors in R.
- Use popular functions from the lubridate package to manipulate dates.
Module 8
Building on what you learned in the previous module, you will deepen your understanding of the R data types with particular emphasis on tidyverse.
You will develop a practical sense of choosing appropriate functions from various packages associated with certain data types (stringr, forcats, and lubridate) by watching the live performances and well-prepared demonstrations on using those packages. Further, you will also be given the opportunity to drill with more advanced topics with those packages.
Upon successful completion of this module, you will be able to:
- Use advanced functions from base R and Tidyverse to view and manipulate data frames that contain the various data types.
- Use advanced functions from the stringr package to handle strings and regular expressions.
- Use advanced functions from the forcats package to manipulate factors in R.
- Manipulate dates and times using advanced functions from the lubridate package.
- Utilize cheat sheets available online to find the right functions quickly.
Module 9
Being able to import different formats of data is important because data scientists deal with secondary data collected by others, and secondary data exist everywhere inside and outside the organizations they work for. Therefore, learning how to import and export data is a fundamental step, which we introduce after you learn about data types and structures in Modules 7 and 8. You must use appropriate tools to import and wrangle the data depending on the data format. Gathering data from outside your company, such as from websites, is also a great way to extend your organization's capability and add value to your company; thus, web scraping is another form of importing data that you can have in your toolbox. Further, you will learn how to deal with so-called labelled data produced by menu-driven proprietary statistical software such as SAS, SPSS, and Stata.
In addition to the Base R functions, Tidyverse has some packages that help you deal with different formats of data created by well-known software.
Upon successful completion of this module, you will be able to:
- Explain how to create a GitHub repository and collaborate with others on the same R projects.
- Effectively load and look through built-in datasets in R.
- Import various formats of data (csv, xlsx, and SPSS) to RStudio.
- Scrape data from the web using SelectorGadget and rvest.
- Import multiple external data sets and work with them.
- Work with labeled data in R.
- Export/save output data to a local PC and push it to GitHub.
Module 10
Building on what you learned in the previous module about data import and export, you will learn how to share outputs more effectively in this module. First, you will learn the concept of literate coding and reproducible research using tools such as R Markdown and Quarto. Second, you will learn how to set up an interactive visualization dashboard online.
Upon successful completion of this module, you will be able to:
- Explain how reproducible research with literate coding enhances efficiency, ethics, transparency, and credibility.
- Explain how Quarto is different from R Markdown (Rmd).
- Describe how to make an interactive dashboard with Shiny in R.
- Produce customized tables from imported SPSS data effectively in R.
- Create charts from imported SPSS data in R.
Module 11
The ability to create functions of your own will bring you even closer to being a data scientist. Although R has many frequently used built-in functions, you will want to be able to create your own functions tailored to your job or tasks once you find yourself or your team performing a particular task repeatedly. By building your own function, you can save time.
In this module, you will be introduced to various types of vectors, logical operators, for loops, and if-else statements. You will also be introduced to good coding practice and see how it positively impacts your code's efficiency. For your code to be more readable, take fewer keystrokes, and execute a batch of jobs faster, you may want to use the family of map() functions from the purrr package, which is part of Tidyverse, or the apply() family of functions from base R. Furthermore, you will be introduced to some built-in R functions, which are built on the same principles you will learn in this module. After you finish this module, you should be considered a programmer.
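To make the comparison concrete, here is a small sketch that applies one custom function in three ways: with a for loop, with purrr, and with base R's apply family. The function label_score() and its pass mark of 60 are hypothetical and used only for illustration.

```r
library(purrr)

# A small custom function built on an if-else statement (hypothetical rule).
label_score <- function(x) {
  if (x >= 60) "pass" else "fail"
}

scores <- c(45, 72, 60)

# 1. for loop: pre-allocate the result, then fill it element by element.
labels_loop <- character(length(scores))
for (i in seq_along(scores)) {
  labels_loop[i] <- label_score(scores[i])
}

# 2. purrr: map_chr() applies the function to each element
#    and returns a character vector.
labels_map <- map_chr(scores, label_score)

# 3. Base R apply family: vapply() does the same with a declared return type.
labels_apply <- vapply(scores, label_score, character(1))
```

All three approaches give the same result here; the map and apply versions simply express the repetition in one line.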
Upon successful completion of this module, you will be able to:
- Describe good coding practices.
- Use the various types of logical operators.
- Use for loops in conjunction with other statements (if-else, next, break, etc.).
- Create your own R functions.
- Use map functions from the purrr package to increase efficiency.
- Use a family of apply() functions to simplify repetitive tasks.
- Be familiar with some built-in R functions.
Module 12
Building on the fundamentals of programming you learned in the previous module, you will deepen your understanding of R programming in this module. You will also get more practice with the family of map() functions from the purrr package. Further, you will be exposed to more packages and tools that allow you to go beyond the familiar Tidyverse ecosystem.
Specifically, we will go deeper into functions and iteration with more explanations and examples to practice. Then, you will learn more about the purrr package and its family of map functions with demonstrations and exercises, including walk(), walk2(), and pwalk().
As the concepts are hard, you will be given many examples to help you appreciate them better. For instance, you will be introduced to a way to run thousands of statistical tests using the purrr and broom packages.
Upon successful completion of this module, you will be able to:
- Explain a good coding style.
- Describe all components of a function and their roles.
- Set return values and describe the environment in a function.
- Explain the differences between various for loop variations.
- Map over multiple arguments using map2() and pmap().
- Use walk(), walk2(), and pwalk() in the middle of the pipelines.
Capstone Project
Working on a hands-on project is a great way to solidify your learning. Such a comprehensive, practical project can be added to a resume, portfolio, or LinkedIn profile. Use all the tools you learned from all 12 modules. Sometimes, you may have to use additional materials given in each module.
The Capstone project will be challenging due to its scope and nature. If you worked hard on all 12 modules, you will have ideas to try, but you are not likely to remember all the functions you need. Review the codebooks and find the code that you can apply to the new content.
Upon successful completion of this module, you will be able to:
- Be familiar with R data structure.
- Perform various operations to view and manipulate data.
- Import various types of data and export output data to other types of data.
- Wrangle data to “tidy” form for visualization and modeling.
- Write code to automate/simplify routine operations.
- Create charts, using R’s built-in functions as well as popular packages such as ggplot2.
Targeted Careers and Job Outlook
Data science is one of the top two jobs in America in 2021 (Glassdoor). According to the US Bureau of Labor Statistics, employment of data scientists is expected to rise 22 percent by 2030 – far faster than the eight percent average for all occupations. According to Harvard Business School, data science can be used for all areas of business – accounting, finance, manufacturing, management, marketing, and operations – for such tasks as gaining customer insights, increasing security, informing internal finances, streamlining manufacturing, and forecasting future marketing trends. Data science methods are used in the hard sciences as well. There are several different jobs under the broad umbrella of data science – Data Engineer, Data Analyst, Researcher, Business Executive, Entrepreneur, Full-Stack Data Scientist, etc. “Yet, to harness the power of big data, it isn’t necessary to be a data scientist,” according to HBS. Whatever your goal is, this certificate program is designed to introduce you to the R programming language.
Key Features of Certificate Program
- Take the program anywhere in the world as the program is delivered online.
- The offering is fully asynchronous, meaning that there is no set class time. It takes two weeks to finish one module and 12 weeks to finish all modules, including the capstone project.
- However, you will need to manage your time so that the assignment associated with each module is finished by the deadline set on Canvas.
- Each module will follow the Quality Matters framework that has been proven effective for online learning success. That is, each module will start with learning outcomes, followed by step-by-step instructions, including a one-hour video lecture, supplemental materials to reinforce the lecture, and assignment(s).
- Each module will have one Principle Assignment (optional with bonus points) and an Application Assignment.
- The Principle Assignment is intended for participants to spend the time watching lecture videos and organizing the learning, leading to a nice codebook that can be used for future reference. Since some experienced participants may prefer to skip the process, this assignment is not required, but participants who submit the assignment will be given extra credit and will be provided an additional set of codes that can expand what was taught in the video.
- The Application Assignment is intended to ensure that participants can apply what they learned to real-world situations that require critical thinking, problem solving, and creativity. Each Application Assignment is composed of about 10 questions that are usually connected to each other with the same data set, serving as a mini project.
- An instructor (professor with a Ph.D. with ample teaching and consulting experience) and the knowledgeable mentors at the Center for Customer Insights and Digital Marketing will be available to answer your questions immediately. See “Support for Learning” for details.
- To receive a certificate of completion, participants must receive at least a grade of D- from the course.
- You can choose to do only the beginner level or both the beginner and intermediate levels. With two optional assignments and one required assignment per module for beginners and experienced coders alike, everyone should be able to pass the course by spending 3 hours per week. You are welcome to save study materials to your local computer for future study or reference. To fully achieve the learning outcomes during the course, however, one is expected to spend about 10 hours a week.
- Watching a video is never sufficient to demonstrate your knowledge and skills in the topic, which is why we give participants hands-on practice assignments - Principle Assignment (optional) and Application Assignment.
- The Capstone Project gives participants the opportunity to utilize everything they learned to import, tidy, transform, and visualize messy data. The project will be available for participants to tackle for 12 weeks and can be added to their resume separately, demonstrating their skills and confidence.
- Instructor. The instructor will provide timely feedback on your assignments so that you can use it to improve your grades on the next assignments. The instructor will also provide advice on data analytics careers in general and in-depth Rmd coding documents that can expand your learning. The instructor will also offer office hours via Zoom for those who prefer personal interaction.
- 24/7 Support by Mentors. Our knowledgeable mentors will guide your learning and are focused on answering your questions and motivating you to stay on track during the 12-week period. To be successful in a certificate program, it is very important to stay on track. You can contact the mentors at any time with any quick questions you may have so that you are not slowed down in your progress. These mentors at the Center for Customer Insights and Digital Marketing are employees who are knowledgeable on the topics. Thus, they will provide the highest level of assistance under close supervision of the instructors.
- Learning Community. Students are encouraged to post questions and answer questions among themselves in our password-protected Canvas, which is a leading Learning Management System, to create a safe online learning environment. Mentors and the instructors will also regularly check in to give support to the learning community.
Who Should be Enrolled in This Certificate Program?
- Students who want to take various data science programs (e.g., MS in Digital Marketing, MS in Business Analytics, etc.) and various statistics courses at undergraduate as well as graduate levels.
- Company employees who want to add a data science tool beyond MS Excel in their tool kit.
- Anyone who wants to have a career in data science and business analytics.
- Anyone from around the world who wants to learn R programming is welcome to apply.
- Both novices and experienced users are welcome! There are optional capstone assignments that experienced R users may want to tackle. Likewise, there are optional Principle Assignments that new users of R would want to work on to ensure they have a strong foundation.
Program Offering Timeline
Following is the planned schedule for the next offering:
- Spring 2024: February 12 - May 12
Testimonials
Each time we offer the program, we conduct an anonymous course evaluation to gather participants’ feedback on the course and instructor. Following are selected testimonials from participants who completed the evaluation.
- “I enjoyed the step-by-step workshop video followed by an application assignment that pushed me to use the tools I learned to apply to code of my own. It was well-packaged and helpful to do alongside work and school.”
- “I really appreciate how organized this course was. With being a full time participant with many obligations, this course was very straightforward and clear to take. I really liked how the tools and resources were provided for us to use, and I was able to figure out my own problems when programming. I would recommend this program to others!”
- “I like how thorough the instructions and notes are throughout the Canvas course. I also like the videos since they are easy to follow. The feedback on the homework along with having the key are important to me to help improve my coding skills. I haven't attended office hours but I like the choice between two days in the evening. Whenever I have sent an email, I always receive a reply and/or feedback.”
About the Instructors
Dr. Jae Jung is a Professor of Marketing and the director of the Center for Customer Insights and Digital Marketing (CCIDM) at Cal Poly Pomona. He is also the director of the MS in Digital Marketing program. He received a Ph.D. in Marketing from the University of Cincinnati and an MBA degree with a concentration in Business Statistics from the University of North Texas. Currently, Dr. Jung is interested in applying econometrics and data science methods to consumer behaviors, and is working on several projects in the area of social media and digital marketing dealing with firm-level data, national-level data, and individual social media data. His research has been published in journals such as European Journal of Marketing, International Marketing Review, Journal of Business Research, Journal of Cross-Cultural Psychology, Marketing Letters, and Psychology & Marketing. Dr. Jung has taught various courses including Marketing Research, Data Mining for Marketing Decisions, and Marketing Analytics, often incorporating real-world proprietary customer data of companies. Dr. Jung orchestrated the design and production of a series of R workshops that attracted hundreds of participants from both the campus and the business community. This experience has led to the offering of DWV 100, which is a prerequisite for MS in Digital Marketing students if they lack coding experience. His research and teaching efforts earned him recognitions, including the prestigious Jagdish N. Sheth Research Award, Wall of COOL, Provost Teacher-Scholar Program Award, and Faculty of the Year Award.
Dr. Carsten Lange is a Professor of Economics. Dr. Lange received his Ph.D. in Economics from the University of Hannover, Germany. Dr. Lange specializes in money supply, inflation, central bank policy, and economic impact analysis. Dr. Lange developed expertise in machine learning and AI including neural networks and deep learning, serving as a member of Cal Poly’s High Performance Computing Cluster Working Group and engaging with the campus community on the topics. Most recently, he used his expertise in analytics and computer technology to create a database that tracks the estimated number of active COVID-19 cases in the most populous counties in all continental states of the U.S. Collaborations include applying machine learning to analyze electronic properties of crystals, predicting Major League Baseball pitcher salaries with artificial intelligence, and using GIS to predict urban development in North Carolina. Dr. Lange’s research has been published in books and peer-reviewed journals such as the Journal of Risk Finance, International Journal of Monetary Economics and Finance, Quarterly Review of Economics and Finance, and Journal of Emerging Markets. Dr. Lange has taught a number of subjects at both graduate and undergraduate levels, including Economic Statistics, Mathematical Economics, Spatial Statistics and Analyses, Neural Networks, and Machine Learning. His teaching and research efforts earned him numerous awards and honors, including the Innovative Approaches to Instruction Award, Provost Teacher-Scholar Program Award, and Golden Leave Award.