COTE NOTE
The Center for Online Teaching Excellence
What I know about Data Analysis
I would like to share what I know about data analysis
Jim Greenberg
I am currently the Director of the Teaching, Learning, and Technology Center at SUNY Oneonta. I have worked at SUNY Oneonta for 33 years, helping to deploy technology in ways that improve teaching and learning (I hope). Along the way I have taught courses in Geographic Information Systems, Advanced Networking, various programming languages, and finally New Media. I’ve guest lectured and given workshops on numerous topics relating to technology over the years. I have served on committees at all levels, most recently EDUCAUSE’s EQ Editorial Committee and SUNY’s IITG Reviewer Committee. Personally, I am interested in how technology and culture interact, particularly in education. Some of the things I’ve been involved with over the years that I am most proud of are the establishment of the Teaching, Learning, and Technology Center on my campus and being in the room when COA and CIT were conceived.
The ability to collect, curate, analyze, and visualize large data sets is becoming critical in all disciplines. I would like to share how the University at Buffalo’s Center for Computational Research (CCR) and SUNY Oneonta faculty have collaborated to build an environment (VIDIA), accessible to all SUNY students, that allows teaching and research in the emerging field of “data science.”
What is it
VIDIA is hosted by the CCR and was made possible through a 2013 SUNY Innovative Instruction Technology Grant (IITG). The VIDIA site is powered by the HUBzero Platform for Scientific Collaboration, originally developed at Purdue University. HUBzero was specifically designed to help a scientific community share resources. Users can upload their own content, launch computations, and view results with an ordinary web browser, without having to download, compile, or install any code. The tools they access are not just web forms, but powerful graphical tools that support visualization and comparison of results.
How it works
HUBzero is an open-source software platform used to create web sites or “hubs” for scientific collaboration, research, and education. It has a unique combination of capabilities that support science and engineering. In addition to allowing access to hundreds of applications through a web browser, HUBzero technology is a little like YouTube.com in that it allows people to upload content and “publish” to a wide audience. Instead of being restricted to short video clips, it handles datasets, analysis tools, and other kinds of scientific content. In that respect, HUBzero is a little like MIT’s OpenCourseWare, but it also integrates the content with collaboration capabilities. A little like Google Groups, HUBzero lets people work together in a private space where they can share documents and send messages to one another. A little like Askville on Amazon.com, HUBzero lets people ask questions and post responses, but about scientific concepts instead of products.
Providing undergraduates in SUNY access to high-performance computing for data analysis and visualization in all disciplines. Big data is sending ripples through all sectors of society. We track everything. And this trend is leading to a critical need for skilled professionals who can mine and interpret the data.
What I did
Working with a team of faculty at Oneonta and staff at the CCR, I helped deploy this HUBzero environment with carefully selected tools and configurations so that undergraduates in social science courses (Sociology and Political Science) could complete assignments in social media analysis.
The Open SUNY Center for Online Teaching Excellence
July 11, 2014 • Volume 1 • Issue 3
COTE NOTE Staff
The COTE Community Team: Alexandra M. Pickett, Associate Director, SUNY Learning Network; Martie Dixon, Assistant Academic Dean, Distance Learning & Alternate Programs, Erie Community College; Patricia Aceves, Director of the Faculty Center in Teaching, Learning & Technology, Stony Brook University; Lisa Dubuc, Coordinator of Electronic Learning, Niagara County Community College; Christine Kroll, Assistant Dean for Online Education, Graduate School of Education, University at Buffalo; Deborah Spiro, Assistant Vice President for Distance Education, Nassau Community College; Lisa Raposo, Assistant Director and Academic Programs Manager, SUNY Center for Professional Development; Erin Maney, Senior Instructional Designer, Open SUNY
This publication is produced by the Open SUNY Center for Online Teaching Excellence under the SUNY Office of the Provost.
Contact/Questions
State University Plaza, Albany, New York 12246
ContactCOTE@suny.edu
How to Submit Material
This publication is produced in conjunction with the COTE “Fellow Chat” speaker series. Please submit a proposal at http://bit.ly/COTEproposal for consideration. Visit http://commons.suny.edu/cote for more information. To join COTE, visit http://bit.ly/joinCOTE
How I did it
Faculty in the social sciences identified desired student outcomes, and we used these as a guide to evaluate software tools. Faculty wanted their students to get a “movie trailer” of what it was like to be a data scientist in their discipline. In addition, they wanted students to be able to test theories that are discussed in class. Three software packages were evaluated: Orange, R, and Rapid Miner. All have sophisticated text and data analysis capabilities as well as visualization tools. Rapid Miner was chosen because of its ease of use and because we could prepare processes in advance for undergraduates to use. R is being deployed this summer to expand the tool set for students and faculty.
Faculty at Oneonta, with the help of CCR staff, learned how to deploy resources, use Rapid Miner, and build processes and datasets for students. Instructions, example processes, and datasets were deployed for students. Students in three courses at Oneonta created their own accounts, downloaded the datasets they had created using another tool (Trackur) that was acquired under this grant, then ran Rapid Miner processes to analyze and visualize their data. Using these, students prepared reports and presentations as part of their course work. These assignments were designed for students to use data of their own interest and to test theories they had learned about in class.
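For readers who want a feel for what a basic text analysis of harvested social media posts involves, here is a minimal Python sketch. It is not the Rapid Miner process the students ran, and it does not use Trackur data. The sample posts, the short stop-word list, and the function name term_frequencies are all made up for illustration. It simply lowercases each post, splits it into words, drops common filler words, and counts what remains.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration; real analyses use far longer lists.
STOP_WORDS = {"a", "an", "and", "the", "is", "to", "of", "in", "for", "on", "about"}


def term_frequencies(posts, top_n=5):
    """Return the top_n most common non-stop-word terms across a list of posts."""
    counts = Counter()
    for post in posts:
        # Lowercase the post, then treat runs of letters/apostrophes as tokens.
        tokens = re.findall(r"[a-z']+", post.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(top_n)


# Made-up sample posts standing in for a harvested dataset.
sample_posts = [
    "The election debate is tonight",
    "Debate night: the candidates talk about the economy",
    "Economy and jobs dominate the debate",
]

if __name__ == "__main__":
    for term, count in term_frequencies(sample_posts, top_n=2):
        print(term, count)
```

A term count like this is only a starting point. The prepared processes in the environment go further, adding visualization on top of the analysis.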
Why I did it
The harvesting and analysis of social media is an emerging tool in the social sciences. It has become increasingly important that SUNY students have the opportunity to become familiar with these emerging methodologies. This environment makes that possible.
What happened when I did it
For me, this project was a powerful example of the benefits to a four-year comprehensive college of collaborating with a university center. A lot happened when I did this. Friendships were formed. Expertise was taken advantage of. The ability of a university center to build a sustainable environment that diverted almost no IT resources on the local campus was demonstrated. Best of all, we were able to create an accessible, sustainable data analysis and visualization environment that any SUNY campus can use. Most importantly for SUNY Oneonta, it is accessible to undergraduates.
What I learned
I learned that open source can and does work, and that it is possible to build out an environment on HUBzero technology that can replace much of what local academic computing people try to deploy and support. If nothing else, I learned that SUNY should try to incentivize taking advantage of the university center resources. Our university centers are capable of deploying IT environments that other SUNY campuses cannot, and we can take advantage of this.
How others can use it
If you want to try out this environment, go to http://vidia.ccr.buffalo.edu and register for an account. Once you have an account, follow the instructions at https://vidia.ccr.buffalo.edu/resources/42/download/icebreaker-RM-instructions-v3.txt to work through a basic text analysis using prepared data and processes in Rapid Miner. You can also look through the resources posted in the environment or contact me at jim.greenberg@oneonta.edu and I will be happy to get you started.
This publication is disseminated under the Creative Commons license Attribution-Noncommercial-Share Alike 3.0.