UD PCS provides data analyst training options
Two programs teach in-demand skills
By Adam S. Kamras
Data is easier than ever to collect and store, but knowing what to do with it and how to analyze it remains a challenge, especially for those without the proper training. Skilled professionals who combine the computational, analytical and communication proficiencies needed to discover data-supported solutions to important business questions are invaluable to an organization’s success. Their work ranges from boosting customer acquisition and retention, managing risk, and identifying the source of product performance problems to predicting competitive bond-buying bids, putting together a winning baseball roster, and countless other functions.
Data analysts are employed in numerous industries, and a variety of educational paths can be taken to hone the necessary skills. Two of these routes are provided by UD PCS via its Predictive Analytics and Data Mining Certificate and Foundations of R for Data Analysis Certificate programs. Depending on a person’s interests and background, either course, or both, could be a good fit.
Predicting outputs as a function of inputs
Taught by a pair of retired DuPont employees, Steven P. Bailey and Aaron J. Owens, Predictive Analytics and Data Mining addresses how to define the goals of a project, identify or collect appropriate data, analyze the data to determine a solution, and communicate the results effectively to others.
“The more data you have, the more you are going to get a lot of statistically significant information that may not be of practical importance or good at predicting the future,” said Bailey. “If suitably organized into a spreadsheet or a worksheet, we can use a number of techniques all focused on coming up with models that predict one or more outputs as a function of our inputs.”
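Bailey’s point about statistical significance versus practical importance can be illustrated with a small calculation. The sketch below is not taken from the course; it is a minimal Python example that assumes two groups with equal, known variance and a made-up difference in means of one-hundredth of a standard deviation. The function name `two_sample_z` and the sample sizes are hypothetical choices for illustration.

```python
import math

def two_sample_z(effect_in_sd, n_per_group):
    """Return the z statistic and two-sided p-value for a difference in means.

    Assumes both groups have the same known variance, so the effect can be
    expressed in standard-deviation units.
    """
    z = effect_in_sd / math.sqrt(2 / n_per_group)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p

# The same tiny effect (0.01 standard deviations) at two sample sizes.
for n in (100, 1_000_000):
    z, p = two_sample_z(0.01, n)
    print(f"n per group = {n:>9,}  z = {z:.2f}  p = {p:.3g}")
```

With enough rows, even a negligible difference clears the usual significance threshold, which is why a small p-value alone says little about whether a finding matters in practice.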
Whereas predictive analytics refers to the use of statistics and modeling techniques to make predictions about future outcomes and performance, data mining is a process used to turn raw data into useful information.
“The whole purpose of what we’re trying to do is get a model that is exactly the appropriate complexity for the data set that’s sitting there,” said Owens.
JMP Pro is the primary analytics software used throughout the Predictive Analytics and Data Mining Certificate program. The menu-driven commercial software package does not require programming skills and allows users to perform predictive modeling and cross-validation, among other tasks.
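For readers curious about what cross-validation does when matching a model’s complexity to a data set, the idea can be sketched in a few lines of code. This is not JMP Pro and not the instructors’ method; it is a minimal Python illustration that assumes synthetic data (output equals twice the input plus random noise) and swaps in a simple nearest-neighbor average as the model, where the number of neighbors `k` controls complexity. All names and values here are hypothetical.

```python
import random

def knn_predict(train, x, k):
    """Predict the output at x as the mean output of the k nearest training inputs."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def cv_error(data, k, folds=5):
    """Estimate mean squared prediction error with k-fold cross-validation."""
    total = 0.0
    for f in range(folds):
        held_out = data[f::folds]  # every folds-th point is held out
        train = [p for i, p in enumerate(data) if i % folds != f]
        total += sum((knn_predict(train, x, k) - y) ** 2 for x, y in held_out)
    return total / len(data)

random.seed(0)
# Synthetic data, made up for illustration: output = 2 * input + noise.
data = [(x / 10, 2 * x / 10 + random.gauss(0, 1)) for x in range(100)]
random.shuffle(data)

# Small k follows the noise closely; large k smooths away real structure.
errors = {k: cv_error(data, k) for k in (1, 5, 15, 40)}
best_k = min(errors, key=errors.get)
print(errors, "-> chosen k:", best_k)
```

Each candidate is scored only on points it did not see during fitting, so the setting with the lowest held-out error is the one whose complexity best suits the data rather than the one that memorizes it.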
Perform data analysis with R
Rather than using a software tool like JMP Pro, or in addition to it, some data analysts employ programming languages to perform their tasks. Foundations of R for Data Analysis instructor Ryan Harrington’s programming vehicle of choice is R, a free and open-source statistical language that enables users to extract, clean, visualize and model data.
Harrington, the director of strategy and operations for the Delaware Data Innovation Lab at Tech Impact, uses R to teach the basics to aspiring data analysts and data scientists, focusing on the mechanics of programming with it rather than on statistical modeling techniques. He said he is teaching a data analytics course supported by computer programming rather than what he would call a true programming course. Though programming is the means to the end, his goal is to train the students to be capable data scientists or data analysts.
“I’m not just teaching the language of R; I’m teaching the mindset for how a data analyst would go about doing their job,” said Harrington. “The Predictive Analytics and Data Mining course teaches the methods, and Foundations of R for Data Analysis teaches about a programming language that can be used to perform the methods.”