Harold Dwight Lasswell, the American political scientist, stated that a convenient way to describe an act of communication is to answer the following questions:
Who
Says What
In Which Channel
To Whom
With What Effect?
According to Lasswell’s model of communication, content (of any kind, e.g., a document or an image) is a message that travels from a communicator to a receiver through some channel and produces some effect (observable as actions or a change in behavior). Content therefore comes to life only when consumers start interacting with it (reading it, writing about it, liking it, sharing it, basically any kind of action). Without interactions, content has no meaning. Think of a creation that is never read or seen: what is the point of analyzing it?
In today's machine learning world, these spokes of communication are treated as disconnected from each other. For example, current ML systems try to extract information about content (through tasks such as sentiment analysis and keyword extraction) and "learn" about content from the content alone.
If we analyze Lasswell’s model a bit more, we find that we can define content by one or a combination of the other factors of the model, for instance, by the effect it produces in a certain type of receiver. In machine learning language, this essentially means we can map a point in one vector space to a point in another vector space.
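To make the vector-space framing concrete, here is a minimal sketch (my own illustration, with made-up dimensions and toy data, not taken from any specific system) of learning a map from a content-embedding space to an "effect" space, e.g., predicting a vector of interaction signals from a content vector:

```python
import torch
import torch.nn as nn

# Hypothetical dimensions: content embeddings (e.g., from a text encoder)
# and "effect" vectors (e.g., normalized counts of clicks, shares, likes).
CONTENT_DIM, EFFECT_DIM = 768, 8

# A simple learned map from the content space to the effect space.
content_to_effect = nn.Sequential(
    nn.Linear(CONTENT_DIM, 256),
    nn.ReLU(),
    nn.Linear(256, EFFECT_DIM),
)

optimizer = torch.optim.Adam(content_to_effect.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Toy batch standing in for (content embedding, observed effect) pairs.
content_vecs = torch.randn(32, CONTENT_DIM)
effect_vecs = torch.randn(32, EFFECT_DIM)

pred = content_to_effect(content_vecs)
loss = loss_fn(pred, effect_vecs)
loss.backward()
optimizer.step()
```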
One may ask, why is that even required?
There are quite a few reasons for this.
The first is that understanding content is incomplete without understanding who produced it and who is going to consume it, i.e., the creation and consumption workflows. Take the case of sentiment analysis: sentiment is not a core characteristic of content. It depends on at least three spokes: the content, the producer, and the consumer. Think of the last time the same message was perceived differently by you and by a friend; the literature on perception offers excellent expositions of this. If consumers consume content differently depending on factors beyond those present in the content itself, then a content representation is incomplete without encoding the other factors of the communication model illustrated by Lasswell.
The second reason is that extracting content features from the content itself requires a lot of supervision while, at the same time, we ignore free sources of (unsupervised) data about that content: the creator who produced it, the consumers who consumed it, and the interactions they generated while consuming it (the channel is generally assumed to be fixed here, e.g., a mobile app or web pages such as blogs and social media posts and tweets). Using these three other spaces, the hope is that we can encode content representations with less supervision.
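As one rough illustration of how interactions can serve as a free training signal, the sketch below (my construction; the encoders and batch data are assumed) uses an InfoNCE-style contrastive loss to pull a content embedding towards the embeddings of the interactions that content actually received and away from interactions on other content:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(content_emb, interaction_emb, temperature=0.07):
    """InfoNCE-style loss: row i of content_emb is paired with row i of
    interaction_emb (the interactions that content actually received);
    all other rows in the batch act as negatives."""
    content_emb = F.normalize(content_emb, dim=-1)
    interaction_emb = F.normalize(interaction_emb, dim=-1)
    logits = content_emb @ interaction_emb.t() / temperature
    targets = torch.arange(content_emb.size(0))
    return F.cross_entropy(logits, targets)

# Toy batch: in practice these would come from a content encoder and an
# interaction encoder (e.g., encoding comments, likes, dwell patterns).
content_batch = torch.randn(16, 128, requires_grad=True)
interaction_batch = torch.randn(16, 128)
loss = contrastive_loss(content_batch, interaction_batch)
loss.backward()  # gradients flow back into whatever produced content_batch
```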
Because the amount of content per communicator is typically small, content representations in the communicator space are often sparse. Simplifying the three spaces further, we therefore get a reduced version of Lasswell’s model, the content-consumer interaction model, which encodes only three spokes of Lasswell’s model: what message, to whom, and with what effect. Here we assume the channel characteristics are constant and the communicator space is sparse, and therefore not informative enough to keep. If either of these two assumptions does not hold, e.g., in social media and advertising where the communicator space is relatively denser, we can add the other two spaces back and retain all five.
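A minimal sketch of what such a content-consumer interaction model could look like is given below; the dimensions, feature names, and set of effect labels are assumptions for illustration, not a specific published architecture. One encoder handles "what message", another "to whom", and a small head predicts "with what effect":

```python
import torch
import torch.nn as nn

class ContentConsumerInteractionModel(nn.Module):
    """Encodes 'what message' and 'to whom', predicts 'with what effect'."""

    def __init__(self, content_dim=768, consumer_dim=64, hidden=256, n_effects=5):
        super().__init__()
        self.content_enc = nn.Sequential(nn.Linear(content_dim, hidden), nn.ReLU())
        self.consumer_enc = nn.Sequential(nn.Linear(consumer_dim, hidden), nn.ReLU())
        # Effect head, e.g. scores over {ignore, read, like, share, comment}.
        self.effect_head = nn.Linear(2 * hidden, n_effects)

    def forward(self, content_feats, consumer_feats):
        c = self.content_enc(content_feats)
        u = self.consumer_enc(consumer_feats)
        return self.effect_head(torch.cat([c, u], dim=-1))

model = ContentConsumerInteractionModel()
# Toy batch: pretrained content embeddings and consumer profile features.
logits = model(torch.randn(8, 768), torch.randn(8, 64))
effect_labels = torch.randint(0, 5, (8,))
loss = nn.functional.cross_entropy(logits, effect_labels)
loss.backward()
```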
While the above explanation is good from a theoretical point of view, any good theory is incomplete without applications. Therefore, I will present a few examples of the model discussed above.
The first is better content representation and understanding using the other two spokes of the content-consumer interaction model. An example of this work is LearnAd, where we showed that generated interaction data (without even consumer data) can improve content understanding and achieve state-of-the-art performance in advertisement understanding, with improvements in topic classification, sentiment classification, and question answering. In a similar vein, there is work by Nora Hollenstein, Barbara Plank, Badri Patro, and others. All of these excellent studies showed that interaction patterns can improve various content extraction tasks such as named entity recognition and question answering.
A second direction of work in this domain is about understanding behavior or interaction patterns (rather than content) using the other two spokes of the content-consumer interaction model. A representative example of this work is CatchyContent, where we showed, first, that behavior can be predicted from content and consumer and, second, that content can be optimised to elicit a certain behavior from a specific consumer type.
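To illustrate the second part, optimising content towards a desired behavior, here is a sketch, again my own construction rather than the CatchyContent method itself. It reuses the ContentConsumerInteractionModel class sketched above, freezes it as a stand-in for a trained interaction predictor, and takes gradient steps directly on the content embedding so that the predicted effect for a given consumer type moves towards the target behavior:

```python
import torch

# Reuses the ContentConsumerInteractionModel sketched earlier; in practice
# this would be a model already trained on observed interactions.
model = ContentConsumerInteractionModel()
for p in model.parameters():
    p.requires_grad_(False)

consumer_feats = torch.randn(1, 64)   # features of the target consumer segment
target_effect = torch.tensor([3])     # e.g., index of the desired "share" behavior

# Start from the current content embedding and optimise it directly.
content = torch.randn(1, 768, requires_grad=True)
optimizer = torch.optim.Adam([content], lr=0.05)

for _ in range(100):
    optimizer.zero_grad()
    logits = model(content, consumer_feats)
    loss = torch.nn.functional.cross_entropy(logits, target_effect)
    loss.backward()
    optimizer.step()

# `content` now encodes a message the model predicts will elicit the target
# behavior; in practice it would be decoded back to text or imagery, or used
# to guide edits to the original creative.
```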
These two examples show that we can go from one space to the other and ask and answer questions that are not possible to answer in just one space: for example, content representation for behavior elicitation, or consumer representation in the content space and vice versa.
- Yaman K Singla,
Research Scientist at Adobe Media Data Science Research,
Google PhD Fellow at SUNY-Buffalo and IIITD