Marshall IDX is a means to measure the popularity of words. It is a project I started only a couple of months ago; the site is still in beta, but you can already have a look around and check out some of the data.
The idea arose from the question: what or who is really popular, and how popular?
This question is not an easy one to answer, since the media presence and popularity of both issues and people fluctuate greatly, which makes the media popularity of a particular person or topic very difficult to determine. The number of Google search results may sometimes seem to indicate popularity, but it is not a very precise measure, because the search engine lists all results found on the Net, no matter whether anybody has ever looked at these pages.
Google has, in fact, made an attempt to provide an indicator of popularity by creating Google Trends, but there the data is based only on search requests made through Google, and so far Google has not revealed any absolute numbers and only allows terms to be compared against an index. Let's run a test on Google Trends and compare some of its results to ours:
Obviously, what people search for on the Net does not necessarily match what is really popular: many people who are not techies have probably never heard of "mysql" or "php", but almost everybody who reads newspapers knows Obama and, throughout the world, encounters the word "terrorism" on a daily basis. Knowing what people are searching for on the Net might be very valuable information, but it does not have much to do with popularity. Another way people try to find out about the popularity of someone is by counting search results:
Results 1 - 10 of about 225,000,000 for obama. (0.13 seconds)
Results 1 - 10 of about 176,000,000 for mysql. (0.18 seconds)
Results 1 - 10 of about 45,100,000 for terrorism. (0.67 seconds)
Results 1 - 10 of about 8,070,000,000 for php. (0.19 seconds)
Results 1 - 10 of about 882,000,000 for war. (0.21 seconds)
Results 1 - 10 of about 1,220,000,000 for mp3. (0.19 seconds)
Again, such a listing misses the point: going by these numbers, "php" would be about 36 times more popular than "Obama", and "mp3" would be a more popular term than "war" and more than five times as popular as "Obama". Popularity is better indicated by media presence: what people look at and read every day, which reaches them through the media.
The Marshall Index is expressed as a number: one point represents one million individuals who came into contact with a particular term within a 24-hour time window. For example, if you search for the word "Olympics", our service will calculate an index based on how often the word is mentioned right now in online media. With the Marshall Index it will also be possible to track words and watch their development in the media over a particular time period, from years down to seconds. This service is suited for anybody interested in media impact, such as companies, public figures, advertising and public relations agencies, investment companies, media companies, domainers, and many other industries, as well as the general public. By providing charts of the Marshall Index, the tool becomes interesting for many types of comparative studies in investment, medicine, music, movies, politics, and many other fields.
To calculate the Index, a spider needs to search through the predefined media websites. There is only one Marshall Index, no matter in which country or in which language a medium is published. For Germany, for example, the maximum available number of Marshall points will be 82, because the country has a population of about 82 million (63 million of them adults) and a literacy rate of 100%.
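The country cap can be sketched in a few lines. This is only an illustration under an assumption: that the cap is the population in millions multiplied by the literacy rate, with the 82 points for Germany corresponding to roughly 82 million inhabitants. The function name max_marshall_points is hypothetical and not part of the service.

```python
def max_marshall_points(population_millions: float, literacy_rate: float) -> float:
    """Upper bound on Marshall points for one country.

    Assumption: one point equals one million people who are able to read
    the term, so the cap is population (in millions) times literacy rate.
    """
    return population_millions * literacy_rate


# Germany, assuming about 82 million inhabitants and a literacy rate of 100%
germany_cap = max_marshall_points(82.0, 1.0)
print(germany_cap)
```

A country with a lower literacy rate would get a proportionally lower cap under the same assumption.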
First, all of these media websites need to be analyzed so that the spider only crawls the current articles and skips archives, forums, and old articles.
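A rough sketch of such a filter might look like the following. The excluded path segments, the 24-hour age limit, and the function name should_index are all assumptions made for illustration; the actual per-site analysis is not described here and is likely more involved.

```python
from datetime import datetime, timedelta
from urllib.parse import urlparse

# Hypothetical path segments that mark non-article sections of a site
EXCLUDED_SEGMENTS = {"archive", "archives", "forum", "forums"}
# Assumption: an article counts as "old" once it falls outside the 24-hour window
MAX_AGE = timedelta(hours=24)


def should_index(url: str, published: datetime, now: datetime) -> bool:
    """Return True if the page looks like a current article worth crawling."""
    segments = {part.lower() for part in urlparse(url).path.split("/") if part}
    if segments & EXCLUDED_SEGMENTS:
        return False  # archive or forum page
    return now - published <= MAX_AGE  # skip old articles


now = datetime(2008, 8, 8, 12, 0)
print(should_index("http://example.com/news/olympics-open", now - timedelta(hours=3), now))
print(should_index("http://example.com/forum/thread-42", now - timedelta(hours=1), now))
print(should_index("http://example.com/news/old-story", now - timedelta(days=3), now))
```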
The points added to the Index for each medium are calculated from the traffic to its website and from the frequency and placement of the word on that site.
For example: The New York Times online has 3.5 million visitors today and the printed newspaper has 2.3 million readers, so the total number of points we can add for a word on this day is 5.8. (We count every visit, even if a visitor returns several times a day, because our traffic data is based on page views rather than on unique visitors, and we add points for the printed edition of the NYT.)
Now let's say the word "Microsoft" appears on the front page of the NYT as a title and two more times on the same page. Then we add 30% of the maximum points of the NYT for the title and 10% for each time the word reappears on the same page, totaling 50% of the maximum points. 50% of 5.8 million readers is 2.9 points, because the assumption is that 2.9 million people see the word "Microsoft" at this moment (always based on a 24-hour time window).
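The arithmetic of this example can be written out as a short sketch. The 30% and 10% weights and the reader figures come from the example above; the function names and the cap at 100% of a site's maximum are assumptions added for illustration, not a description of the production code.

```python
TITLE_WEIGHT = 0.30   # share of a site's maximum points for a mention in a title
REPEAT_WEIGHT = 0.10  # share for each further mention on the same page


def max_points(online_readers_m: float, print_readers_m: float = 0.0) -> float:
    """Maximum points a medium can add in 24 hours (1 point = 1 million readers)."""
    return online_readers_m + print_readers_m


def word_points(site_max: float, in_title: bool, repeats: int) -> float:
    """Points added to the Index for one word on one page of a medium."""
    share = (TITLE_WEIGHT if in_title else 0.0) + REPEAT_WEIGHT * repeats
    share = min(share, 1.0)  # assumption: never more than the site's maximum
    return site_max * share


nyt_max = max_points(3.5, 2.3)                               # online + print readers
microsoft = word_points(nyt_max, in_title=True, repeats=2)   # 30% + 2 x 10% = 50%
print(f"NYT maximum: {nyt_max:.1f} points, Microsoft: {microsoft:.1f} points")
```

Summing word_points over all monitored media would then give the word's overall index for the 24-hour window, again under the assumptions stated above.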