A researcher at the University of Arizona says she might have discovered a way to predict which college freshmen are headed for failure, long before they flunk a class. The discovery has implications not just for college students, but for everyone in an era of "big data."
Sudha Ram, with the UA's Management and Information Systems Department, is enthusiastic about her recent project, which ingested the records of how thousands of first-year students used their CatCard student IDs on campus, and identified the 20 percent of students who didn't come back for a second year.
"I believe I've developed a cool method to use data, without revealing anything about the students, without knowing anything about them, by just using these data points," Ram said.
Her algorithm correctly identified the freshmen who chose not to return about 85 percent of the time.
Ram said she doesn't know exactly how the algorithm chose those students. The project relied on "machine learning."
"We quantify all these and we put all these features together and then the algorithm itself is a black box," she said.
Ram insisted her research did not violate anyone's privacy, because the university gave her "anonymized" data, scrubbed of card numbers, the names of the students, and the names of the places where they used their cards.
Even so, Ram's project drew immediate attention nationwide and on campus. Several news outlets picked up on it. A Fortune magazine commentator accused the UA of "spying" on its students.
Kay Mathiesen, an associate professor at the School of Information, said that if the university is releasing information about students, even anonymized data, the students should be informed.
"I have a digital privacy class, and I talked to my students yesterday about this. Other than my super-tech-savvy students who just think they're always having data collected about them because they're inside the system, [they] were completely unaware that any of their CatCard data would have been collected," Mathiesen said.
But just as troubling to Mathiesen, who specializes in "information ethics," is what she sees as the creeping acceptance of data mining by business and government, with the goal of influencing behavior, from the products we buy, to whether a student stays in college.
"I'm worried about creating a set of citizens who think about the world in this way, that it's the job of other people in power to collect information about me and to make predictions about my behavior and then try to intervene and change my behavior," Mathiesen said.
While Ram's project was purely research, the University of Arizona does analyze other data to address potential dropouts. Angela Baldasare, the UA's assistant provost for institutional research, said her department uses attendance records, test scores, and other metrics. Several times a year it gives advisers a list of the students most at risk. Using that approach, the UA saw 83 percent of the last freshman class return, a three-point gain over the year before.
Baldasare said adding Ram's CatCard research to the university's student retention efforts is a possibility.
"If we did, it would involve a larger campus conversation I think ... because then we're stepping outside of data that people typically expect would be used to support their academic success," Baldasare said.
That campus conversation would include disclosing to students that their CatCard usage on campus is being tracked, Baldasare said, and students should be given the choice to opt out of being tracked if they wish.
Although the university records every use of every one of the 60,000 active CatCards, the current CatCard terms and conditions are silent about what happens to that data.
Part one of this story: what the CatCard data show.