Posts tagged visualization

Word Spectrum


Using Google’s enormous bigram dataset, I produced a series of visualizations that explore word associations. Each visualization pits two primary terms against each other. Then, the use frequency of words that follow these two terms is analyzed. For example, “war memorial” occurs 531,205 times, while “peace memorial” occurs only 25,699. A position for each word is generated by looking at the ratio of the two frequencies. If they are equal, the word is placed in the middle of the scale. However, if there is an imbalance in the uses, the word is drawn towards the more frequently related term. This process is repeated for thousands of other word combinations, creating a spectrum of word associations. Font size is based on an inverse power function (uniquely set for each visualization, so you can’t compare across pieces). Vertical positioning is random.

To better achieve an even distribution, I normalized the frequencies of bigrams based on total primary term frequency. So, for example, in the case of war vs. peace, there are 81,839,381 bigrams starting with war and 31,263,375 bigrams starting with peace. If I render the spectrum without normalization, it ends up lopsided toward war (since the usage totals are so much higher). To compensate, I scale down all of war’s bigrams so that the overall frequencies are even.


» via Chris Harrison
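
The positioning idea in the Word Spectrum post can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the post says positions come from "the ratio of the two frequencies" after normalizing by each primary term's bigram total, but it does not give the exact formula, so the share-based mapping to a 0..1 scale and the function name `spectrum_position` are hypothetical.

```python
def spectrum_position(count_a, count_b, total_a, total_b):
    """Place a following word on a 0..1 scale between two primary terms.

    0.0 means the word follows only term A, 1.0 only term B, 0.5 is balanced.
    The exact mapping is an assumption; the post only says a ratio is used.
    """
    # Normalize each bigram count by the total number of bigrams
    # that start with its primary term.
    share_a = count_a / total_a
    share_b = count_b / total_b
    if share_a + share_b == 0:
        return 0.5  # the word follows neither term: keep it in the middle
    return share_b / (share_a + share_b)


# Figures quoted in the post: "war memorial" vs. "peace memorial".
normalized = spectrum_position(531_205, 25_699, 81_839_381, 31_263_375)
# Passing totals of 1 leaves the raw counts unnormalized.
unnormalized = spectrum_position(531_205, 25_699, 1, 1)
print(f"normalized: {normalized:.3f}, unnormalized: {unnormalized:.3f}")
```

With the quoted counts, "memorial" lands at roughly 0.11 with normalization and roughly 0.05 without it, which matches the post's remark that the unnormalized spectrum leans further toward war.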

A Day in the Life of NYTimes.com

Have you ever wondered where the readers of The New York Times’s Web site come from, and what kind of devices they use to read our content? In a past life, not too long ago, when I worked in The Times’s research and development labs, we started a research visualization project to explore this very topic. I worked on these visualizations with Michael Young, Michael Kramer, and Noriaki Okada.

The two videos below show the traffic to NYTimes.com on June 25, 2009, the day Michael Jackson died. The 24-hour period is compressed into a little over a minute and a half.

The top video represents readers coming to the Web site from the United States. The second video shows a map of our global readers. The circles indicate two things. First, the yellow circles represent readers coming to the main Web site from desktop or laptop computers, and the orange circles indicate readers using mobile phones to access our mobile site. Second, the size of the circles represents the number of readers at that moment in time. You can see the corresponding time stamp in the upper left corner of the videos.

Just watching these maps glow can be a mesmerizing experience, but there’s another fascinating piece of data within this particular day. At about 1 minute 10 seconds into the video, at 5:20 p.m., you can see a huge pulse of readers coming to the Web site, both from mobile devices and personal computers. This huge traffic bump happened after TMZ.com broke the news of Mr. Jackson’s death. As the news started to filter across the Internet, traffic continued to ebb and flow throughout the evening.

» via The New York Times

smarterplanet:


Nebul.us Shows You Your Activity on the Web | FlowingData

Nebul.us is an online application, currently in private beta, that aggregates and visualizes your online activity. Enter your information for Twitter, Facebook, Flickr, etc., and install a plugin in Firefox to record your browsing behavior. Get something that looks like the above, sort of a donut-polar area chart hybrid. Nebul.us calls it a cloud.

courtenaybird:


socialsciencevisualized:

News Dots: Interactive map of how every story in the news is related. Includes major US news outlets only. Updated daily. Click, drag and zoom. slatest.slate.com

roomthily:

Mentionmap visualizes your Twitter network.

On Visualizing Information

jingc:

There can be a directness and clarity to visual information that cuts through the noise, the smoke, and the walls of information around us. It can help us zoom in and see what really matters. Or what might be being hidden from us.

A1 Explorer Screencast

The Visible Archive is a research project on the visualisation of archival datasets, by Mitchell Whitelaw, Senior Lecturer in the Faculty of Arts and Design at the University of Canberra.

The “A1 Explorer” interface is built around a word frequency cloud based on item titles, showing co-occurrences between related terms. A histogram shows the number of items with start dates in each year that are related to the selected keywords, and users can also select and request a visual record of any of the museum items in the collection. “What this shows is that given the opportunity, interactive visualization can provide not only insights into the structure and content of an archival collection; it can also provide an interface to the (digitised) collection itself.”

More information on the Visible Archive Series Browser

via information aesthetics
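
The counting behind an interface like the A1 Explorer can be sketched as follows. This is a rough Python illustration, not the project's actual code: the function names, the whitespace tokenization, and the three sample titles are all made up for the example.

```python
from collections import Counter
from itertools import combinations


def term_stats(items):
    """Count title terms and the pairs of terms that share a title.

    `items` is a list of (title, start_year) tuples (hypothetical schema).
    """
    term_counts = Counter()
    pair_counts = Counter()
    for title, _year in items:
        # Count each distinct term once per title; sorting keeps pair keys stable.
        terms = sorted(set(title.lower().split()))
        term_counts.update(terms)
        pair_counts.update(combinations(terms, 2))
    return term_counts, pair_counts


def year_histogram(items, keyword):
    """Number of items per start year whose title contains `keyword`."""
    return Counter(
        year for title, year in items if keyword in title.lower().split()
    )


# Made-up sample records, for illustration only.
sample = [
    ("War memorial", 1920),
    ("War cabinet minutes", 1941),
    ("Peace memorial", 1920),
]
terms, pairs = term_stats(sample)
print(terms.most_common(2))
print(year_histogram(sample, "memorial"))
```

Term counts would drive the size of words in the cloud, pair counts the co-occurrence links, and the per-year counter the histogram for a selected keyword.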

Series Browser Screencast

The Visible Archive is a research project on the visualisation of archival datasets, by Mitchell Whitelaw, Senior Lecturer in the Faculty of Arts and Design at the University of Canberra.

The “Series Browser” breaks up the collection into Agencies and Series. An Agency corresponds to the organization responsible for some or all of the functions or legislation documented in records. Given that there are some 9,000 Agencies involved, the visualization uses their ID codes to color the squares: low Agency numbers have low hue values (red), while high Agency numbers have high hue values (blue to purple). The area of the inner (brighter) square is proportional to the number of items, while the area of the outer (duller) band is proportional to shelf meters used. The result is that Series that are physically small, but contain many items, appear with very thin borders. It also shows that using Agencies offers a useful way to break the huge collection up into manageable-sized subsets: the vast majority of Agencies relate to fewer than 100 of the 57.5 thousand Series.

More information on the Visible Archive Series Browser

via information aesthetics
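
The Series Browser encoding described above (hue from Agency ID, inner area from item count, outer band area from shelf meters) can be approximated in Python. Everything specific here is an assumption made for illustration: the linear hue mapping, the 0.8 hue ceiling standing in for "purple", the unit scale factors, and the function names.

```python
import colorsys
import math


def agency_hue(agency_id, max_agency_id, max_hue=0.8):
    """Map an Agency ID linearly onto a hue in [0, max_hue].

    0.0 is red; 0.8 is a purple-ish hue (assumed upper bound).
    """
    return max_hue * agency_id / max_agency_id


def series_square(items, shelf_meters, item_scale=1.0, shelf_scale=1.0):
    """Return (inner_side, outer_side) for one Series square.

    Inner square area is proportional to the item count; the band between
    inner and outer squares has area proportional to shelf meters.
    """
    inner_area = item_scale * items
    outer_area = inner_area + shelf_scale * shelf_meters
    return math.sqrt(inner_area), math.sqrt(outer_area)


# A Series with many items but little shelf space gets a thin border.
inner, outer = series_square(items=400, shelf_meters=41)
border = (outer - inner) / 2
rgb = colorsys.hsv_to_rgb(agency_hue(4500, 9000), 1.0, 1.0)
print(inner, outer, border, rgb)
```

Because both quantities are encoded as areas, the border thickness falls out of the two square roots rather than being set directly, which is why item-heavy but physically small Series show up with very thin borders.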