Watertown's Education Resource Strategies tried solving its problem in-house. Every solution the nonprofit concocted, however, proved to be labor-intensive. What the team needed was a seasoned data science team, which is where Harvard-spun startup DrivenData came in.
ERS helps school districts use their resources more strategically by providing them with a way to compare their spending to that of other school districts. Before partnering with DrivenData, that process involved assigning every line item to a category in a comprehensive financial spending framework, a task that required an average of 400 man-hours per project and limited the nonprofit's ability to give school districts the analysis they need to improve.
By working with DrivenData, though, ERS could pose a question — "What algorithm will allow us to code financial files more accurately, more quickly and more cheaply?" — and receive hundreds of different solutions from experienced data scientists.
"Given the amount of time we invest in this process for so many of our projects, DrivenData's solutions seemed like an amazing return on our investment," said ERS Associate Dan Turcza, in response to why the team decided to take the chance on an unproven concept being built out of the Harvard Innovation Lab.
DrivenData is a for-profit social enterprise that hosts online competitions, with the goal of engaging a global community of data scientists in solving social problems. The startup grew out of the co-founders' shared desire for real-world data sets to practice on that did more than serve the commercial sector; if they were going to spend their time coding, they wanted to do it for social good.
Founders Peter Bull, Isaac Slavitt and Greg Lipstein launched DrivenData's first competition with ERS in late October, closing it at the start of 2015. The challenge was for data scientists to build an automated algorithm that would take data on how a school district spends its money and put it into ERS's apples-to-apples spending categories. On the line was $7,500, with $5,000, $2,000 and $500 going to the first-, second- and third-place entries, respectively.
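For readers curious what such a tagging task looks like in code, here is a minimal sketch in Python. It is not any competitor's method: the token-overlap scoring, the function names and the sample line items and categories are all assumptions invented for illustration, and a real entry would likely rely on far richer models and data.

```python
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a line-item description and split it into words."""
    return text.lower().split()


def train(examples):
    """Count how often each word appears under each category."""
    counts = defaultdict(Counter)
    for text, category in examples:
        counts[category].update(tokenize(text))
    return counts


def predict(counts, text):
    """Pick the category whose training words overlap most with the text."""
    tokens = tokenize(text)
    scores = {cat: sum(c[t] for t in tokens) for cat, c in counts.items()}
    return max(scores, key=scores.get)


# Hypothetical line items and categories, invented for illustration only.
labeled = [
    ("teacher salary elementary", "Instruction"),
    ("substitute teacher pay", "Instruction"),
    ("school bus fuel", "Transportation"),
    ("bus driver salary", "Transportation"),
]

model = train(labeled)
label = predict(model, "bus fuel purchase")
print(label)
```

The appeal of automating the step is that once a model has seen enough hand-coded line items, new ones can be tagged in bulk rather than one at a time.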
"The results were very good," Lipstein said, noting that, among the winning entries, participants from the U.S., Russia, Germany and Brazil were represented. "One cool aspect, which I didn't really foresee, is that data scientists from around the world came together to help solve this problem for a Boston-based nonprofit and for education."
Unlike a hackathon, where projects are rarely turned into something actionable, the DrivenData team will be working with ERS to operationalize the winning algorithms and create a tool the nonprofit can use. And to ensure the tool is used effectively, Bull said DrivenData will complete the implementation for the organization in-house, saving ERS the 400 man-hours per project it had been spending.
When used together, the top three solutions submitted to the competition achieved an average tagging accuracy of 90 percent — a result Turcza admitted to being pleasantly surprised by.
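The article does not say how the three solutions were combined. One common approach, shown here purely as an assumption, is to average each model's per-category probabilities and take the highest-scoring category. The sketch below uses made-up probabilities for two hypothetical line items and then measures tagging accuracy as the share of items labeled correctly.

```python
def average_probabilities(model_outputs):
    """Average per-category probabilities across models for one line item."""
    categories = model_outputs[0].keys()
    n = len(model_outputs)
    return {c: sum(m[c] for m in model_outputs) / n for c in categories}


def accuracy(predicted, actual):
    """Fraction of line items whose predicted category matches the true one."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical outputs from three models for two line items (illustrative only).
items = [
    [{"Instruction": 0.6, "Transportation": 0.4},
     {"Instruction": 0.3, "Transportation": 0.7},
     {"Instruction": 0.8, "Transportation": 0.2}],
    [{"Instruction": 0.2, "Transportation": 0.8},
     {"Instruction": 0.6, "Transportation": 0.4},
     {"Instruction": 0.1, "Transportation": 0.9}],
]
true_labels = ["Instruction", "Transportation"]

blended = [average_probabilities(outputs) for outputs in items]
predictions = [max(b, key=b.get) for b in blended]
score = accuracy(predictions, true_labels)
print(predictions, score)
```

In the first item, one model disagrees with the other two, and averaging lets the majority view prevail. That is the usual argument for blending several models rather than trusting one.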
"As a non-data-science expert, you don't have a very good intuitive sense of how predictable your data is," Turcza said. "My worst fear around the DrivenData work was that even the smartest models composed by experts would only get half of the answers right. So we're thrilled to have done so well, and want to thank our competitors for that."
Competitors, in turn, thanked DrivenData and ERS, with the winner doing so in a public forum post.
Roughly 800 data scientists are now signed up to the site, according to Lipstein. Among those 800, thousands of solutions were submitted, each of which participants were able to see and learn from, meaning those who finished in 20th place could study what the top three winners did.
Additional contests are in the works and, although not yet ready to be announced, Bull hinted that DrivenData has "solid plans for competitions in the public health and public policy space," with monetary prizes, paid for by the partner organizations, attached.
"There's definitely more competitions," Bull assured.
And that's all eager data scientists need to hear.