Friday, April 18, 2014

Multimedia Of the People, By the People needs multimedia technology For the People

Flickr: krazydad
Yesterday, our Multimedia Computing Group had a strategy day at the Delft Arts Center. These are the days on which we take time (time that, strictly speaking, we cannot really spare from deadlines and projects) to step back and look at the larger picture of our group and of multimedia research. The location of the Arts Center is green and the Dutch April obliged us with sun, ideal for taking time to contemplate and discuss the bigger questions.

My larger picture is this: I am interested in the digital reflexes of human thought, creativity, and communication. These reflexes lead to the generation of multimedia and interaction data. The natural group with interest in this data is the people that it represents, i.e., the people whose efforts and activities caused it to come into being. I create algorithms and technology that support people in getting the most out of “their” data.

Of course, there is a lot of other multimedia data out there, including satellite images, medical images, and surveillance video. These are, of course, also forms of multimedia. However, typically, the people who generate these data (or the people who create the systems that generate these data) are not themselves represented in the data.

In my view, multimedia “Of the People, By the People” (i.e., data arising from the creativity and activity of a large number of general-population users) should be distinguished from special-purpose multimedia. This means that we need a concept of multimedia systems “For the People” and that these should be the focus of special development effort.

There are conceptual and algorithmic reasons why multimedia systems “For the People” should be developed as a separate class. One important point is that multimedia of the people and by the people is also generated for multiple purposes, giving rise to complexity not encountered in other systems. For example, a satellite image wouldn’t be expected to fulfill two radically different goals, such as “education” and “entertainment”, whereas such multi-facetedness is quite common for a video on YouTube. We tackled such complexity in a recent publication, “Using Crowdsourcing to Capture Complexity in Human Interpretations of Multimedia Content”, but I will not discuss it further here.

Here, instead, I would like to focus on the ethical aspects of why multimedia “Of the People, By the People” should be the subject of dedicated research devoted to creating multimedia systems for users. My motivation was a conversation during yesterday's lunch that came scarily close to arriving at the question, “Do we really have time for ethics in our research anyway?” Given that it required a significant amount of sacrifice to make time for the strategy day, I was really not ready to come to the conclusion that, whoops, no, after all, no time for ethics. I tried to make the point that ethics does not necessarily require time, but rather it involves making explicit choices that bear our research forward in one direction as opposed to another. The difference between the directions may be subtle, but the implications are huge.

Differentiating multimedia “Of the People, By the People” from other types of multimedia is a way of looking at what we do that opens the door to consideration of the ethical aspects of our research. In particular, if we target multimedia systems “For the People”, our research makes a contribution to supporting the dialogue on an important issue of the digital age: the right of individuals and communities to benefit from the data produced as a result of their own creativity and activity. I am not asserting that creating multimedia technologies “For the People” is in itself an act that one can consider “ethical”. Rather, by choosing to do so, and by making this choice explicit, we support people working in other disciplines in defining, and finding answers to, the tough ethical questions.

Social multimedia sharing on the internet has led to the creation of incredible resources. YouTube and Flickr contain an enormous amount of material that teaches, records aspects of human life, or represents the product of creative talent. In order that multimedia resources are able to reach their full potential of serving the good of individuals and society as a whole, it is important that technical tools are developed that will allow these resources to be searched and browsed. 

A somewhat trivial way of understanding (or “feeling”) how the people who created these resources also have a right to them is to try a simple exercise. Imagine Google “turned off” YouTube. Would you feel that something had been taken away from you? Would schools, communities, and social relationships between people be damaged? The answer is clearly “yes”. I am not going so far as to make the argument that YouTube videos should be considered public goods. There are complicated intellectual property issues at play there. The point here is simple: “turning off” YouTube would have larger implications than those of “just” a company making a business decision for business reasons.

Let’s look at a less trivial example: Imagine that Google does not turn off YouTube, but rather starts to spend less effort on maintaining it (e.g., there are streaming problems, or we can’t find what we are looking for). That’s a slippery slope going down from what YouTube is today to the point of “turning it off”. Google’s position on that slope again has consequences for the ability of people to access content that they themselves have created. In other words, that position goes beyond mere business decisions.

Finally, let’s turn to where we are today: Google makes certain decisions about the types of multimedia search technologies to develop and not to develop for YouTube. These decisions have consequences. One decision in which these consequences are perhaps most clearly evident is that YouTube does not provide time-code level search functionality for videos. This means that, for example, a student studying data fitting can’t directly search for short segments of lecture videos explaining least squares, but instead must do a significant number of video-level searches.

Of course it would be a strange, if not bad, business decision to make the search function of YouTube capable of precisely addressing specific questions. Users that come, get their question answered with one query, and leave again, are not going to hang around and click the ads that make such a critical contribution to revenue. 

The point is that decisions about whether or not to develop certain technologies (and I can assure you that time-code level search is non-trivial) have implications for the ability of the community who created and contributed the content to use it as well.

In order to drive this point home, let me give one more example, based on Wikipedia. There is widespread agreement that Wikipedia is a peer-produced resource that is important for society worldwide. In the case of Wikipedia, the importance of access technology in guaranteeing people’s ability to make use of the resource is clear. Imagine that all of Wikipedia were dumped into an enormous text file without structure and without a search function. Would it be useful to the larger community? Scarcely. The “right” to use Wikipedia is guaranteed not only by the Creative Commons license under which its content is created, but also by the structure of the Wikipedia site and the search functionality that it offers.

As systems get more complicated, and involve processing of pixels and sound samples, we cannot guarantee that appropriate search and access technologies will be developed without significant effort. The “ethical” decision to make as a multimedia researcher in academia is to focus efforts on those research topics that lead to multimedia systems “For the People” that industry is not necessarily incentivized to address.

Note that I am not asserting that companies like Google are not working to make access to user-contributed collections of multimedia better. Quite to the contrary. Flickr, for example, just announced integration of object recognition (http://techcrunch.com/2014/04/17/yahoo-acquisitions-power-flickrs-new-object-recognition-search-editing-and-video-capture/).

However, in order to remain profitable, companies must pay careful attention to their bottom line. For this reason, it is important that actors outside of industry (such as universities) remain actively engaged in developing technologies that help people get the most out of multimedia data that they themselves have created. Such actors experience no conflict of interest, and are thus free to stand squarely on the side of the people who created the data when conceiving and developing new multimedia algorithms.

How much time do we need to consider ethics in our research? Not that much. We greatly benefit from a few odd moments of discussion and reflection. Yesterday, I welcomed the chance to view the big picture, and to step back and ask if I, as a multimedia researcher, am truly serving the interests of the people whose effort and action gave rise to the data that I study in the first place.