Dozens of new citizen science projects are launched daily. Most of them feature beautiful maps, fair amounts of crowdsourced data, and impressive visualization tools. But how many of these projects provide a genuinely faithful representation of the phenomena they monitor? In other words: we can ask birdwatchers to track birds’ migration paths, but will we really get a realistic map?
It is hard to say, because quality benchmarks for crowdsourcing projects are not always available. Most of the time, the credibility of crowdsourced data is a free parameter assigned by the reader, rather than an objective assessment accompanying the data. That is why EveryAware includes statistical physicists among its partners: one of the goals of the project is to extract general quantitative laws from crowdsourced data. For example, we would like to design an efficient scheme for environmental communication based on mobile ICT. For that we need good in vivo models for opinion spreading, rating algorithms, collaborative filtering systems and the like, and we will try to devise at least some of them ourselves.
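To make the "quantitative laws" idea a bit more concrete, here is a minimal sketch of the kind of toy model statistical physicists typically start from when studying opinion spreading: a voter model, where each agent repeatedly copies the opinion of a random neighbor. This is a standard textbook model, not the EveryAware methodology; all names and parameters below (agent count, ring-lattice neighborhood, step count) are illustrative choices for the sketch.

```python
import random

def voter_model(n_agents=200, neighbors=4, steps=20000, seed=42):
    """Minimal voter model on a ring lattice.

    Each agent holds a binary opinion; at every step a random agent
    copies the opinion of a random neighbor. Returns the fraction of
    agents holding opinion 1, sampled every 100 steps.
    """
    rng = random.Random(seed)
    opinions = [rng.randint(0, 1) for _ in range(n_agents)]
    history = []
    half = neighbors // 2
    for t in range(steps):
        i = rng.randrange(n_agents)
        # pick a random neighbor within `half` hops on the ring
        offset = rng.choice([d for d in range(-half, half + 1) if d != 0])
        j = (i + offset) % n_agents
        opinions[i] = opinions[j]  # imitation: agent i adopts j's opinion
        if t % 100 == 0:
            history.append(sum(opinions) / n_agents)
    return history

if __name__ == "__main__":
    trajectory = voter_model()
    print("initial fraction of opinion 1:", trajectory[0])
    print("final fraction of opinion 1:  ", trajectory[-1])
```

The value of such toy models is that they yield quantitative baselines (e.g. how fast consensus emerges as a function of population size and network structure) against which real crowdsourced signals can, at least in principle, be compared.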
Luckily, we are not alone. After the great hype around the crowdsourcing of science, the science of crowdsourcing is set to become a major research focus. There is a growing community of researchers studying how a crowdsourcing scheme should be designed to get the best results out of it, given constraints inspired by real user experience. My latest Google Scholar Alert delivered articles such as “So Who Won? Dynamic Max Discovery with the Crowd” by Hector Garcia-Molina’s group at Stanford, and such papers appear with increasing frequency: the number of published papers on “crowdsourcing” and “optimization” jumped from 181 to 513 between 2010 and 2011 (Google Scholar data). Moreover, the WWW, which already hosts many crowdsourcing platforms, provides an ideal testbed for such models. Theory and experiment, therefore, can now go hand in hand, as we would expect from a mature scientific discipline.