Four Easy Visualization Mistakes to Avoid

http://blog.visual.ly/data-visualization-mistakes-to-avoid/

Creating a great visualization is not as hard as it seems. Provided you have some interesting data and an effective tool with which to visualize it, a little bit of thoughtful design will lead to a decent result. That said, there are some mistakes that are very easy to make but can ruin even a thoughtfully made piece. Here are four data visualization mistakes you should avoid.

1. Serving the Presentation Without the Data

Which comes first: the presentation or the data? Oftentimes, in an effort to make a visualization more “interesting” or “cool”, designers will allow the presentation layer of a visualization to become more important than the data itself. The visualization captured below is an unfortunate casualty. A considerable amount of work went into it and there are parts that are informative, like the summary counters at the top left. However, without a scale or axis, the time series on the bottom right is meaningless and the 3D chart in the center is even more opaque. Tooltips (pop ups) would help, if they were there. Instead, this looks amazing, but does little.



2. Showing Too Much Detail

We all know the feeling of finding a dataset that is rich and easy to visualize, with numerous usable categorical and numerical fields. The temptation is to show everything at once, and allow users to drill down to the finest level of detail. Often, that actually makes a visualization superfluous because the user could simply look at the dataset itself if they wanted to see the finest level of detail. The trick, then, is to show enough detail to tell a story, but not so much that that story is convoluted and hidden. More data can be revealed as the reader progresses through the story. This visualization could have been great but there is so much detail it is hard to garner much information from it.


3. Not Explaining the Interactivity

Enabling users to use and interact with a visualization makes it more engaging and engrossing. However, without telling them how to use that interactivity you risk limiting them to the initial view. How you label the interactivity is just as important as doing it in the first place. Usually, informing the user at the top of the visualization (or the part they will see first) is good practice, as is calling out the interaction on or near the tools that utilize it. This visualization is actually very interesting, but without labeling the interactivity, it is easy to overlook the fact that you can click the words at the bottom of the screen to change the view. Even a common design convention, such as underlining the words to suggest a hyperlink, would have been helpful.


4. Failing to Experiment

Often, your first idea is not the best idea. It is easy to get excited about a visualization and then stick to the first vision that came to your mind when you saw the data. However, it is always best to start a visualization with a blank slate of ideas. Then shift perspectives: try ten or twenty different configurations and types of chart before you settle on one. The best part about experimentation is that it often forces new findings out of the data. This chart is not ineffective, but it would be much better if it used bar or even area charts to display its information. It is actually difficult to compare the magnitude of the different parts of the “blob” with this type of view.



There is no perfect visualization, but if you can manage to stay away from these four mistakes, yours will have a much better chance of getting close to perfection.

Ross Perez is a Data Analyst with Tableau Public, a free tool which allows people to put their data on the web in interactive charts and graphs. You can connect with him on Twitter.

Sharon Machlis | 22 free tools for data visualization and analysis

April 20, 2011 (Computerworld)

Got data? These useful tools can turn it into informative, engaging graphics.

You may not think you’ve got much in common with an investigative journalist or an academic medical researcher. But if you’re trying to extract useful information from an ever-increasing inflow of data, you’ll likely find visualization useful — whether it’s to show patterns or trends with graphics instead of mountains of text, or to try to explain complex issues to a nontechnical audience.

Want to see all the tools at once?

For quick reference, check out our chart listing 22 free data visualization tools.

There are many tools around to help turn data into graphics, but they can carry hefty price tags. The cost can make sense for professionals whose primary job is to find meaning in mountains of information, but you might not be able to justify such an expense if you or your users only need a graphics application from time to time, or if your budget for new tools is somewhat limited. If one of the higher-priced options is out of your reach, there are a surprising number of highly robust tools for data visualization and analysis that are available at no charge.

Here’s a rundown of some of the better-known options, many of which were demonstrated at the Computer-Assisted Reporting (CAR) conference last month. Others are not as well known but show great promise. They range from easy enough for a beginner (i.e., anyone who can do rudimentary spreadsheet data entry) to expert (requiring hands-on coding). But they all share one important characteristic: They’re free. Your only investment: time.

Data cleaning

Before you can analyze and visualize data, it often needs to be «cleaned.» What does that mean? Perhaps some entries list «New York City» while others say «New York, NY» and you need to standardize them before you can see patterns. There might be some records with misspellings or numerical data-entry errors. The following two tools are designed to help get your data in tip-top shape to be analyzed.
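As a rough illustration of the kind of standardization described above, here is a minimal Python sketch. The lookup table and the choice of «New York, NY» as the canonical label are assumptions made for the example; neither tool below works this way internally.

```python
# Hypothetical lookup table: each known variant maps to one canonical label.
CANONICAL = {
    "new york city": "New York, NY",
    "new york, ny": "New York, NY",
    "nyc": "New York, NY",
}

def standardize(value: str) -> str:
    """Return the canonical label for a known variant, else the trimmed input."""
    key = " ".join(value.lower().split())  # lowercase and collapse whitespace
    return CANONICAL.get(key, value.strip())

records = ["New York City", "new york, NY ", "Boston, MA"]
cleaned = [standardize(r) for r in records]
print(cleaned)
```

A hand-built table like this only scales to a few variants, which is why tools that suggest the groupings for you are useful.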

DataWrangler

What it does: This Web-based service from Stanford University’s Visualization Group is designed for cleaning and rearranging data so it’s in a form that other tools such as a spreadsheet app can use.

Click on a row or column, and DataWrangler will suggest changes. For example, if you click on a blank row, several suggestions pop up such as «delete row» or «delete empty rows.»

There’s also a history list that allows for easy undo — a feature that’s also available in Google Refine (reviewed next).

What’s cool: Text editing is especially easy. For example, when I selected «Alabama» in one row of sample data headlined «Reported crime in Alabama» and then selected «Alaska» in the next group of data, it led to a suggestion to extract every state name. Hover your mouse over a suggestion, and you can see affected rows highlighted in red.

DataWrangler helps format table data so it can be better used and analyzed by other applications.

Drawbacks: I found that unexpected changes occurred as I attempted to explore DataWrangler’s options; I constantly had to click «clear» to reset. And not all suggestions are useful («promote row to header» seemed an odd suggestion when the row was blank) or easy to understand («fold split 1 using 2 as key»).

And while the fact that DataWrangler is a Web-based service makes it convenient to use, don’t forget that it sends your data off to an external site — which means it isn’t an option for sensitive internal information. However, there are plans for a future release of a stand-alone desktop version. Another important thing to keep in mind is that DataWrangler is currently alpha code, and its creators say it’s «still a work in progress.»

Skill level: Advanced beginner.

Runs on: Any Web browser.

Learn more: There’s a screencast on the DataWrangler home page. Also, see this post on using DataWrangler to format data (from Tableau Public’s blog).

Google Refine

What it does: Google Refine can be described as a spreadsheet on steroids for taking a first look at both text and numerical data. Like Excel, it can import and export data in a number of formats including tab- and comma-separated text files and Excel, XML and JSON files.

Google Refine can make data ‘cleaner’ by helping to find errors or different versions of the same proper names.

Refine features several built-in algorithms that find text items that are spelled differently but actually should be grouped together. After importing your data, you simply select edit cells –> cluster and edit and select which algorithm you want to use. After Refine runs, you decide whether to accept or reject each suggestion. For example, you could say yes to combining Microsoft and Microsoft Corp., but no to combining Coach Inc. with CQG Inc. If it’s offering too few or too many suggestions, you can change the strength of the suggestion function.
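To give a feel for how a similarity threshold changes the suggestions, here is a small Python sketch. It uses the standard library's difflib as a stand-in; Refine's own clustering algorithms are different, and the function names and threshold values are assumptions chosen for the example.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Similarity score between 0 and 1 from difflib (not Refine's method)."""
    return SequenceMatcher(None, a, b).ratio()

def suggest_merges(names, threshold):
    """Return pairs of names whose similarity meets the threshold."""
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if similarity(a, b) >= threshold:
                pairs.append((a, b))
    return pairs

names = ["Microsoft", "Microsoft Corp.", "Coach Inc.", "CQG Inc."]
strict = suggest_merges(names, 0.7)  # fewer suggestions
loose = suggest_merges(names, 0.6)   # more suggestions, more false positives
print(strict)
print(loose)
```

With these sample names, the stricter setting suggests only the Microsoft pair, while the looser one also proposes merging Coach Inc. with CQG Inc. In both cases a person still accepts or rejects each suggestion.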

There are also numerical options that offer quick and easy overviews of data distributions. This functionality can reveal anomalies that might be the result of data input errors — such as $800,000 instead of $80,000 for a salary entry, or it could expose inconsistencies — such as differences in the way compensation data is reported from entry to entry, with some showing, say, hourly wages and others showing weekly pay or yearly salaries.
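A simple way to picture this kind of check is to compare each value with the median. The Python sketch below is not how Refine does it; the factor of 5 and the sample salaries are assumptions for illustration only.

```python
from statistics import median

def flag_suspect_values(values, factor=5.0):
    """Return values more than `factor` times above or below the median."""
    mid = median(values)
    return [v for v in values if v > mid * factor or v < mid / factor]

# Placeholder salaries; the last entry has an extra zero.
salaries = [72_000, 80_000, 85_000, 78_000, 800_000]
suspects = flag_suspect_values(salaries)
print(suspects)
```

The same rule would also flag an hourly wage that was entered in a column of yearly salaries, since it would fall far below the median.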

Beyond data housekeeping, Google Refine offers some useful analysis tools, such as sorting and filtering.

What’s cool: Once you get used to which commands do what, this is a powerful tool for data manipulation and analysis that strikes a good balance between functionality and ease of use. The undo/redo list of every action you’ve taken lets you roll back when needed. And text functions handle Java-syntax regular expressions, allowing you to look for patterns (such as, say, three numbers followed by two digits) as well as specific text strings and numbers.
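For readers new to regular expressions, here is a short sketch of a pattern search. It is written in Python rather than inside Refine, and the pattern (three uppercase letters followed by two digits) is a hypothetical example. The constructs used here (\b, character classes, {n} and \d) are written the same way in Java-syntax regular expressions.

```python
import re

# Hypothetical pattern: three uppercase letters followed by two digits.
CODE = re.compile(r"\b[A-Z]{3}\d{2}\b")

text = "Orders ABC12 and XYZ99 shipped; AB123 did not."
matches = CODE.findall(text)
print(matches)
```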

Finally, while this is a browser-based application, it works with files on your desktop, so your data remains local.

Drawbacks: Although Google Refine looks like a spreadsheet, you can’t do typical spreadsheet calculations with it; for that, you must export to a conventional spreadsheet application. If you’ve got a large data set, carve out some time in your day to go through all of Refine’s suggested changes, since it can take a while. And, depending on the data set, be prepared when looking for text items to merge: You’re likely to get either a lot of false positives or missed problems — or both.

Skill level: Advanced beginner. Knowledge of data analysis concepts is more important than technical prowess; power Excel users who understand data-cleaning needs should be comfortable with this.

Runs on: Windows, Mac OS X (if it appears to do nothing after loading on a Mac, point a browser manually to http://127.0.0.1:3333/ ), Linux.

Learn more: These three screencasts give a good overview of why and how you’d use Refine; there’s also fairly detailed documentation on the Google Code project area.

Statistical analysis

Sometimes you need to combine graphical representation of your data with heftier numerical analysis.

The R Project for Statistical Computing

What it does: R is a general statistical analysis platform (the authors call it an «environment») that runs on the command line. Need to find means, medians, standard deviations, correlations? R can handle that and much more, including «linear and generalized linear models, nonlinear regression models, time series analysis, classical parametric and nonparametric tests, clustering and smoothing,» according to the project website.
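For a sense of the basic calculations mentioned above, here is a sketch in Python rather than R. The sample numbers are placeholders and the pearson helper is written out by hand for the example; R provides these calculations as built-in functions.

```python
from math import sqrt
from statistics import mean, median, stdev

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Placeholder data for illustration.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]

print("mean:", mean(scores))
print("median:", median(scores))
print("standard deviation:", stdev(scores))
print("correlation:", pearson(hours, scores))
```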

The R Project for Statistical Computing provides a wide range of data analysis options.

R also graphs, charts and plots results. There are numerous add-ons to this open-source project that significantly extend functionality. For users who prefer a GUI, Peter Aldhous, San Francisco bureau chief for New Scientist magazine, suggests RExcel, which offers access to the R engine through Excel.

What’s cool: There is a great deal of functionality in R, including quite a number of visualization options as well as numerical and spatial analysis.

Drawbacks: The fact that R runs on the command line means that users will have to take the time to learn which commands do what, and not all users will be comfortable with a text-only interface. In addition, Aldhous says those dealing with large data sets may hit a memory barrier (if so, there’s a commercial option from Revolution Analytics).

Skill level: Intermediate to advanced. Comfort with command-line prompts and a knowledge of statistics are musts for the core application.

Runs on: Linux, Mac OS X, Unix, Windows XP or later.

Learn more: Try R for Statistics: First Steps (PDF) by Peter Aldhous, Hands-on R, a step-by-step tutorial (PDF) by Jacob Fenton, and the project’s own An Introduction to R. The R Statistics blog has a number of visualization samples.

Visualization applications and services

These tools offer a number of different visualization options. While some stick to conventional charts and graphs, many offer a range of other choices such as treemaps and word clouds. A few offer geographical mapping as well, although if you’re interested in maps, our sections on GIS/mapping focus specifically on that.

Google Fusion Tables

What it does: This is one of the simplest ways I’ve seen to turn data into a chart or map. You can upload a file in several different formats and then choose how to display it: table, map, heatmap, line chart, bar graph, pie chart, scatter plot, timeline, storyline or motion (animation over time). It’s somewhat customizable, allowing you to change map icons and style info windows.

Google Fusion Tables is a user-friendly tool that makes it easy to map data.

There are some data editing functions within Fusion Tables, although changing more than a few individual cell entries can quickly become tedious. You can also join tables (which is important when the data you want to map is in multiple tables), and filter, sort and add columns and so on. There are also options to allow others to make comments on the data itself.

Mapping goes beyond just placing points, as many of us are accustomed to with Google Maps. Fusion Tables can also map multiple polygons with variations in color based on underlying data, such as this intensity map showing the percentage of households with Internet access by state from 2007 U.S. Census Bureau data.

The Knight Digital Media Center notes that a handy undocumented feature allows the use of Fusion Tables’ «templating» export to generate a JSON file from data in other formats. JSON is required by some APIs and JavaScript libraries.
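If you only need the end result, a JSON file built from tabular data, the conversion can also be sketched in a few lines of Python. This does not use Fusion Tables or its templating export; the column names and values are placeholders.

```python
import csv
import io
import json

# Placeholder CSV content; in practice this would be read from a file.
csv_text = "state,pct_online\nAlabama,10\nAlaska,20\n"

# Each CSV row becomes one JSON object keyed by the header row.
rows = list(csv.DictReader(io.StringIO(csv_text)))
json_text = json.dumps(rows, indent=2)
print(json_text)
```

Note that csv.DictReader leaves every value as a string, so numeric columns need converting if the receiving library expects numbers.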

Unlike IBM’s Many Eyes, Google lets you designate your data as private or unlisted as well as public, although your data still resides on Google’s servers — a benefit or drawback, depending on whether server bandwidth costs or data privacy is more important to you.

What’s cool: Fusion Tables offers relatively quick charting and mapping, including geographic information system (GIS) functions to analyze data by geography. The service also automatically geocodes addresses, which is useful when trying to place numerous points on a map. This is an excellent tool for beginners and advanced beginners to use to get comfortable with analyzing data; it’s also a good fit for people who don’t program. For more advanced users, there’s an API.

Drawbacks: Functionality, customization and data capacity are all limited compared with desktop applications or custom code, and interacting with large data sets on the site can be sluggish. And it has its limitations — the site choked on March 11, the day of the devastating earthquake and tsunami in Japan. (It is still a Google Labs beta project.)

Skill level: Beginner.

Runs on: Any Web browser.

Learn more: A Google Fusion Tables tour and several tutorials are available. We’ve also got some examples of what it can do in our story «H-1B Visa Data: Visual and Interactive Tools.» Also see the Fusion Tables Example Gallery.

Impure

What it does: Impure is sort of a Yahoo Pipes for data visualization, designed for creating numerous types of highly polished graphical representations of data using a drag-and-drop workspace. The service includes a library of objects and various methods, and — as with Yahoo Pipes — it allows you to click and drag to connect modules so that the output of one becomes the input of another. It was developed by Spanish analytics firm Bestiario.

What’s cool: Impure offers a highly visual interface for the task of creating visualizations, which is not as common as you might expect. It has a sleek user interface and numerous modules, including quite a few APIs that are designed to pull data from the Web. It features numerous visualization types that are searchable by keywords like numeric, tables, nodes, geometry and map. And although it saves your workspaces to the Web, you can copy and save the code behind your workspaces locally, so you can back up your work or maintain your own libraries of code snippets.

Drawbacks: Users of Impure face a surprisingly steep learning curve despite its drag-and-drop functionality. The documentation is detailed in some areas, but lacking in others. For instance, while it was easy to find a list of APIs, it was more difficult to find basic instructions on how to use the workspace — or even figure out that there was a workspace, let alone how to use the various objects and methods.

Once you save your workspace, it’s on the public Web, although it’s unlikely that anyone else will be able to find it unless you share the URL. And I found some of the samples not all that helpful in understanding the underlying data, even if they were visually striking.

Skill level: Intermediate.

Runs on: Any Web browser.

Learn more: To get started, I’d suggest the videos «Interface Basics» (7 minutes) and «Workspaces and Code.» You can find a sample called The Pay Gap Between Men and Women Mapped at the website of British newspaper The Guardian.

Tableau Public

What it does: This tool can turn data into any number of visualizations, from simple to complex. You can drag and drop fields onto the work area and ask the software to suggest a visualization type, then customize everything from labels and tool tips to size, interactive filters and legend display.

Tableau Public can turn data into any number of visualizations, from simple to complex.

What’s cool: Tableau Public offers a variety of ways to display interactive data. You can combine multiple connected visualizations onto a single dashboard, where one search filter can act on numerous charts, graphs and maps; underlying data tables can also be joined. And once you get the hang of how the software works, its drag-and-drop interface is considerably quicker than manually coding in JavaScript or R for most users, making it more likely that you’ll try additional scenarios with your data set. In addition, you can easily perform calculations on data within the software.

Drawbacks: In the free version of Tableau’s business intelligence software, your visualization and data must reside on Tableau’s site. Whenever you save your work, it gets sent up to the public website — which means you can’t save work in progress without running the risk that it will be seen before it’s ready (while Tableau’s site won’t deliberately expose your work, it relies on security by obscurity — so someone could see your work if they guess your URL). And once it’s saved, viewers are invited to download your entire workbook with data. Upgrading to a single-user desktop edition costs $999.

Not surprisingly, all that functionality comes at a cost: Tableau’s learning curve is fairly steep compared to that of, say, Fusion Tables. Even with the drag-and-drop interface, it’ll take more than an hour or two to learn how to use the software’s true capabilities, although you can get up and running doing simple charts and maps before too long.

Skill level: Advanced beginner to intermediate.

Runs on: Windows 7, Vista, XP, 2003, Server 2008, 2003.

Learn more: There are seven short training videos on the Tableau site, where you can also find downloadable data files that you can use to follow along.

You can see a sample in our article «Tech Unemployment Climbs; Self-employment Steady.»

Many Eyes

A pioneer in Web-based data visualization, IBM’s Many Eyes project combines graphical analysis with community, encouraging users to upload, share and discuss information. It’s extremely easy to use and very well documented, including suggestions on when to use what kind of visual data representation. Many Eyes includes more than a dozen output options — from charts, graphics and word clouds to treemaps, plots, network diagrams and some limited geographic maps.

You’ll need a free account to upload and post data, although anyone can browse. Formatting is basic: For most visualizations, the data must be in a tab-separated text file with column headers in the first row.
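A tab-separated file with a header row is easy to produce from most tools. As one possible approach, here is a Python sketch using the standard csv module; the field names and values are placeholders, not real H-1B figures.

```python
import csv
import io

# Placeholder rows for illustration.
rows = [
    {"employer": "Example Corp", "visas": 120},
    {"employer": "Sample LLC", "visas": 85},
]

buffer = io.StringIO()  # swap for open("data.txt", "w", newline="") to write a file
writer = csv.DictWriter(
    buffer, fieldnames=["employer", "visas"], delimiter="\t", lineterminator="\n"
)
writer.writeheader()    # column headers go in the first row
writer.writerows(rows)
tsv_text = buffer.getvalue()
print(tsv_text)
```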

It took me about three minutes to create a bar chart of top H-1B visa employers.

It takes just a few minutes to create online charts like this with Many Eyes.

It took perhaps another minute to create a treemap of the same data.

Many Eyes offers a number of ways to visualize data, such as treemaps.

What’s cool: Visualization can’t get much easier, and the results look considerably more sophisticated than you’d expect based on the minimal amount of effort needed to create them. Plus, the list of possible visualization types includes explanations of the types of data each one is best suited for.

Drawbacks: Both your visualizations and your data sets are public on the Many Eyes site and can be easily downloaded, shared, reposted and commented upon by others. This can be great for certain types of users — especially government agencies, nonprofits, schools and other organizations that want to share visualizations on someone else’s server budget — but an obvious problem for others. (IBM does offer a contact form for businesses interested in hosting their own version of the software.) In addition, customization is limited, as is data file size (5MB).

Skill level: Beginner.

Runs on: Java and any modern Web browser that can display Flash.

Learn more: IBM’s website features pages explaining data formatting for Many Eyes and visualization choices.

You can see some featured visualizations on the Many Eyes home page or browse through some of the tens of thousands of uploads. One interesting map shows popular surnames in the U.S. from the 2000 Census by Martin Wattenberg, one of the creators of Many Eyes.

VIDI

What it does: Although VIDI’s website bills this as a tool for the Drupal content management system, graphics created by the site’s visualization wizard can be used on any HTML page — no Drupal required.

Upload your data, select a visualization type, do a bit of customization selection, and your chart, timeline or map is ready to use via auto-generated embed code (using an iframe, not JavaScript or Flash).

Graphics created by VIDI’s visualization wizard can be used on any HTML page, with no Drupal required.

What’s cool: This is about as easy as Many Eyes — with more mapping options and no need to make your visualization and data set public on its website. There are quick screencasts explaining each visualization type and several different color customization options. And the file-size limit of 30MB is six times larger than Many Eyes’ 5MB maximum.

Drawbacks: Oddly, the visualization wizard was a lot easier to use than the embed code — my embedded iframe didn’t display while trying to preview it on the VIDI website; I needed to save the visualization and go to the «My VIDI» page to get embed code that actually worked. Also, as with any cloud service, if you’re using this for Web publishing, you’ll want to feel confident that the host’s servers can handle your traffic and will be available longer than your need to display the data.

Skill level: Beginner.

Runs on: Any Web browser.

Learn more: The VIDI home page features a link to an 11-minute video tutorial.

It took me less than five minutes to create a sample: a map of earthquakes of 7.0 magnitude or more since Jan. 1, 2000.

Zoho Reports

What it does: One of the more traditional corporate-focused business analytics offerings in this group, Zoho Reports can take data from various file formats or directly from a database and turn it into charts, tables and pivot tables — formats familiar to most spreadsheet users.

What’s cool: You can schedule data imports from sources on the Web. Data can be queried using SQL and can be turned into visualizations, and the service is set up for Web publishing and sharing (although if it’s accessed by more than two users, you will need a paid account).

Zoho Reports provides traditional business charts and graphs.

Drawbacks: Visualization options are fairly basic and limited. Interacting live with the Web-based data can be sluggish at times. Data files are limited to 10MB. I found the navigation confusing at times — for example, after I saved a copy of a sample database, I was told it was in the folder «My reports,» yet I had a hard time finding that.

Skill level: Advanced beginner.

Runs on: Any Web browser.

Learn more: There are video demos and samples on Zoho’s website.

Code help: Wizards, libraries, APIs

Sometimes nothing can substitute for coding your own visualization, especially if the look and feel you’re after can’t be achieved with an existing desktop or Web app. But that doesn’t mean you need to start from scratch, thanks to a wide range of available libraries and APIs.

Choosel (under development)

What it does: This open-source Web-based framework is designed for charts, clouds, graphs, timelines and maps. Right now, it is geared more for developers who create applications than it is for end users who need to save and/or embed their work; but there’s an interactive online demo that lets you quickly upload some data to visualize.

Still under development, Choosel has potential as an easy way to create online graphics.

What’s cool: As with Tableau Public, you can have more than one visualization on a page and connect them so that, for example, mousing over items on a chart will highlight corresponding items on a map.

Drawbacks: This is not yet an application that end users can use to store and share their work. And I found the online demo to be finicky about uploading data — even after I corrected field formats for dates (dd/mm/yyyy) and location (latitude/longitude) as documented, my data wouldn’t load until I had another text field added (rather than just having numerical fields). It was also unclear how to customize labels. This project shows promise if it’s further developed and documented.

Skill level: Expert.

Runs on: Chrome, Safari and Firefox.

Learn more: There’s a short video called Choosel — Timeline and Basic Features and a sample titled Earthquakes With 1,000 or More Deaths Since 1900.

Exhibit

What it does: This spin-off of the MIT Simile Project is designed to help users «easily create Web pages with advanced text search and filtering functionalities, with interactive maps, timelines and other visualization.» Billed as a publishing framework, the JavaScript library allows easy additions of filters, searches and more. The Easy Data Visualization for Journalists page offers examples of the code in use at a number of newspaper websites.

Of course, «easy» is in the eye of the beholder — what’s easy for the professionals at MIT who created Exhibit might not be that simple for a user whose comfort level stops at Excel. Like most JavaScript libraries, Exhibit requires more hand-coding than services such as Many Eyes and Google Fusion Tables. On the other hand, Exhibit has clear documentation for beginners, even those with no JavaScript experience.

What’s cool: For those who are comfortable coding, Exhibit offers a number of views — maps, charts, timeplots, calendars and more — as well as customized lenses (ways to format an individual record) and facets (properties that can be searched or sorted). You’re much more likely to get the exact presentation you want with Exhibit than, say, Many Eyes. And your data stays local unless and until you decide to publish.

Drawbacks: For newcomers unused to coding visualizations, it takes time to get familiar with coding and library syntax.

Skill level: Expert.

Learn more: There are a number of examples you can look at, including Red Sox-Yankees Winning Percentages Through the Years, U.S. Cities by Population and others.

Note: There are numerous other JavaScript libraries to help create visualizations, such as the recently released Data-Driven Documents and the jQuery Visualize plug-in. Six Revisions’ list of 20 Fresh JavaScript Data Visualization Libraries gives you an idea of how many there are to choose from.

Google Chart Tools

What it does: Unlike Google Fusion Tables, which is a full-fledged, self-contained application for uploading and storing data, and generating charts and maps, Chart Tools is designed to visualize data residing elsewhere, such as your own website or within Google Docs.

Google Chart Tools offers both a wizard and an API for creating Web graphics from data.

Google offers both a Chart API using a «simple URL request to a Google chart server» for creating a static image and a Visualization API that accesses a JavaScript library for creating interactive graphics. Google offers a comparison of data size, page load, skills needed and other factors to help you decide which option to use.
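To show what a «simple URL request» for a static chart might look like, here is a Python sketch that assembles one. The endpoint and the parameter names (cht, chs, chd, chl) are recalled from the Chart API documentation and should be treated as assumptions; check the current documentation before relying on them, since the service may no longer respond.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against the current documentation.
BASE_URL = "https://chart.googleapis.com/chart"

params = {
    "cht": "p3",           # chart type (assumed: 3D pie)
    "chs": "250x100",      # image size in pixels, width x height
    "chd": "t:60,40",      # data values in plain-text form
    "chl": "Hello|World",  # labels, separated by a pipe
}

# Keep ':', ',' and '|' readable instead of percent-encoding them.
chart_url = BASE_URL + "?" + urlencode(params, safe=":,|")
print(chart_url)
```

The resulting URL can be used as the src of an ordinary image tag, which is what makes this approach attractive for small, fixed data sets.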

For the simpler static graphics, there’s a wizard to help you create a chart from some sample formats; it goes as far as helping you input data row by row, although for any decent-size data set — say, more than half a dozen or so entries — it makes more sense to format it in a text file.

The visualization API includes various types of charts, maps, tables and other options.

What’s cool: The static image chart is reasonably easy to use and features a Live Chart Playground, which allows you to tweak code and see your results in real time.

The more robust API lets you pull data in from a Google spreadsheet. You can create icons that mix text and images for visualizations, such as this weather forecast note, and what it calls a «Google-o-meter» graphic. The Visualization API also has some of the best documentation I’ve seen for a JavaScript library.

Drawbacks: The static charts tool requires a bit more work than some of the other Web-based services, and it doesn’t always offer lots of extras in return. And for the API, as with other JavaScript libraries, coding is required, making this more of a programming tool than an end-user business intelligence application.

Skill level: Advanced beginner to expert.

Runs on: Any Web browser.

Learn more: See Getting Started With Charts and Interactive Charts. There are also samples in the Google Visualization API Gallery.

JavaScript InfoVis Toolkit

What it does: InfoVis is probably not among the best known JavaScript visualization libraries, but it’s definitely worth a look if you’re interested in publishing interactive data visualizations on the Web. The White House agrees: InfoVis was used to create the Obama administration’s Interactive Budget graphic.

What sets this tool apart from many others is the highly polished graphics it creates from just basic code samples. InfoVis creator Nicolas García Belmonte, senior software architect at Sencha Inc., clearly cares as much about aesthetic design as he does about the code, and it shows.

This sunburst of a directory tree shows some of the visualization capabilities of the JavaScript InfoVis Toolkit. You can see a larger, interactive version on the InfoVis website.

What’s cool: The samples are gorgeous and there’s no extra coding involved to get nifty fly-in effects. You can choose to download code for only the visualization types you want to use to minimize the weight of Web pages.

Drawbacks: Since this is not an application but a code library, you must have coding expertise in order to use it. Therefore, this might not be a good fit for users in an organization who analyze data but don’t know how to program. Also, the choice of visualization types is somewhat limited. Moreover, the data should be in JSON format.

Skill level: Expert.

Runs on: JavaScript-enabled Web browsers.

Learn more: See demos with source code.

Protovis

What it does: Billed as a «graphical toolkit for visualization,» this project from Stanford University’s Visualization Group is one of the more popular JavaScript libraries for turning data into visuals; it’s designed to balance simplicity with control over the display.

What’s cool: One of the best things about Protovis is how well it’s documented, with plenty of examples featuring visualization and sample code. There are also a large number of sample visualization types available, including maps and some statistical analyses. This is a robust tool, capable of building graphics like this color-coded U.S. map with timeline slider.

Drawbacks: As is the case with other JavaScript libraries, it’s pretty much essential for users to have knowledge of JavaScript (or at least some other programming language). While it’s possible to copy, paste and modify code without really understanding what it’s doing, I find it difficult to recommend that approach for nontechnical end users.

Skill level: Expert.

Runs on: JavaScript-enabled Web browsers.

Learn more: Try the How-to: Get Started Guide. You can also find examples of the types of graphics you can build with Protovis at the Protovis Gallery.

GIS/mapping on the desktop

There’s a wide range of business uses for geographic information systems (GIS), ranging from oil exploration to choosing sites for new retail stores. Or, as The Miami Herald did for its Pulitzer Prize-winning coverage of Hurricane Andrew, you can compare maximum wind speeds with damage reports and building information (and perhaps discover, for example, that the worst damage didn’t happen in the areas suffering the heaviest winds, but in areas with a lot of new, shoddy construction).

Quantum GIS (QGIS)

What it does: This is full-fledged GIS software, designed for creating maps that offer sophisticated, detailed data-based analysis of geographic regions.

The best-known desktop GIS software is probably Esri’s ArcView, a robust, well-supported application that costs quite a bit of money. The open-source QGIS is an alternative to ArcView.

Quantum GIS (QGIS) offers full-fledged geospatial visualization and analysis on the desktop.

As OpenOffice is to Microsoft Office, QGIS is to ArcView. ArcView enthusiasts argue that Esri’s offering is a couple of years ahead of open-source alternatives, has a better-developed interface, enjoys commercial support and is better suited for print output. But QGIS users say the open-source alternative is an excellent program that does a great deal of useful GIS work — and may even be better than ArcView when it comes to generating maps for the Web, thanks to a plug-in dedicated to generating HTML image maps.

What’s cool: QGIS has an enormous amount of GIS functionality, including the ability to create maps, overlay various types of data, do spatial analysis, publish to the Web and more. It can also be enhanced with plug-ins that add support for numerous undertakings, including geocoding, managing underlying table data, exporting to MySQL and generating HTML image maps.

Drawbacks: As with any sophisticated GIS application, learning to use this software entails a serious commitment of time and training. Even in hour-long hands-on sessions with first ArcView and then QGIS, I noticed things that were easier to do in the commercial option. For example, ArcView had a one-click «normalize» function to immediately calculate, say, the percentage of people 65 and over versus the total population from a data table with both columns; in QGIS, I needed to pull up a «field calculator» and create a new column with the formula to do that calculation myself.
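The «normalize» calculation described above is simple enough to sketch outside any GIS package. The following is a minimal Python illustration, not QGIS or ArcView code: the column names (`pop_total`, `pop_65_plus`, `pct_65_plus`) and the sample rows are hypothetical stand-ins for a real attribute table.

```python
# Hypothetical rows standing in for a census-style attribute table.
rows = [
    {"area": "Tract A", "pop_total": 4000, "pop_65_plus": 600},
    {"area": "Tract B", "pop_total": 2500, "pop_65_plus": 500},
    {"area": "Tract C", "pop_total": 0, "pop_65_plus": 0},  # unpopulated area
]


def add_percentage(rows, part_key, total_key, new_key):
    """Return copies of the rows with a new percentage column added."""
    result = []
    for row in rows:
        total = row[total_key]
        # Guard against division by zero for unpopulated areas.
        pct = None if total == 0 else round(100.0 * row[part_key] / total, 1)
        result.append({**row, new_key: pct})
    return result


normalized = add_percentage(rows, "pop_65_plus", "pop_total", "pct_65_plus")
for row in normalized:
    print(row["area"], row["pct_65_plus"])
```

This is the same arithmetic a field-calculator formula would perform: one column divided by another, times 100, stored in a new column.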

Runs on: Linux, Unix, Mac OS X, Windows. (This is one case where installation is more complicated on OS X, since it requires manual installation of several dependencies. There’s a one-click installer for Windows.)

Skill level: Intermediate to expert.

Learn more: Timothy Barmann of The Providence Journal posted two very useful tutorials for the CAR conference that are still available: Introduction to QGIS and The Latest in Mapping With JavaScript and jQuery. Barmann also offers a sample: Rhode Island’s Ethnic Mosaic. Another resource to help you get started: QGIS Tutorial Labs from Richard E. Plant, professor emeritus at the University of California, Davis.

Note: If you’re interested in GIS and want to consider other free software options, download this PDF listing of Open Source/Non-Commercial GIS Products. And if you’re looking for a free open-source desktop GIS program that might be fairly easy to use, Jacob Fenton, director of computer-assisted reporting at American University’s Investigative Reporting Workshop, recommends taking a look at the System for Automated Geoscientific Analyses (SAGA) site. Finally, if analyzing geographic data in a conventional database sounds interesting, PostGIS «spatially enables» the PostgreSQL relational database, according to the site.

Web-based GIS/mapping

Most of us are familiar with mapping tools from major companies like Google (which has a number of third-party front ends such as Map A List, an add-on that adds info to a Google Map from a spreadsheet). There are also Yahoo Maps Web Services and Bing Maps, all with APIs. But there are numerous other options from smaller organizations or lone open-source enthusiasts that were designed from the ground up to map geographic data.

OpenHeatMap

What it does: This user-friendly website generates color-coded maps; the colors change depending on underlying info such as population change or average income. It can also place markers on a map, varying the size of the markers based on a data table.

OpenHeatMap is extremely easy to use for creating data-based maps, although there are still occasional bugs in this well-thought-out service.

In addition to providing the Web-based service, author Pete Warden has also packaged OpenHeatMap as a jQuery plug-in for those who don’t want to rely on hosting at OpenHeatMap.com. However, not all data formats work correctly when hosted locally. «My recommended way is to embed the maps from the site,» Warden wrote via Skype chat.

What’s cool: It is astonishingly easy to create a color-coded map from many types of location data — even IP addresses (just use the column header ip_address).

It took me about 60 seconds to create a basic map from a spreadsheet of magnitude 7 or higher earthquakes around the world since Jan. 1, 2000, then a couple of minutes more to customize the rollover box to display both date and magnitude. (You can see a larger version on OpenHeatMap.com.)

Marker transparency, size and color are extremely simple to customize; you can also upload your own marker image, and customize what appears in the tooltips rollover by adding a tooltip column to your data source.

OpenHeatMap automatically figures out and maps locations based on a wide range of place definitions, relying on how the location columns are named — «address,» «country,» «fips_code» (used by the U.S. Census Bureau), «zip_code_area» (for five-digit ZIP codes), «lat» (latitude), «lon» (longitude) and so on.
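As a rough illustration of how such a data source might be laid out, here is a small Python sketch that writes a CSV using the «lat», «lon» and «tooltip» column names mentioned above. The «value» column name is an assumption for the number that drives marker size or color, and the coordinates and magnitudes are approximate figures for three well-known earthquakes, included only as sample rows.

```python
import csv
import io

# "lat", "lon" and "tooltip" come from the column names described in the text;
# "value" is an assumed name for the number behind marker size or color.
FIELDNAMES = ["lat", "lon", "value", "tooltip"]

# Approximate epicenters and magnitudes, used here purely as sample rows.
quakes = [
    (3.3, 95.9, 9.1, "2004-12-26"),
    (-36.1, -72.9, 8.8, "2010-02-27"),
    (38.3, 142.4, 9.0, "2011-03-11"),
]


def build_csv(records):
    """Return CSV text with one header row and one row per record."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    for lat, lon, magnitude, date in records:
        writer.writerow({
            "lat": lat,
            "lon": lon,
            "value": magnitude,
            # Text shown in the rollover box for this marker.
            "tooltip": f"{date}: magnitude {magnitude}",
        })
    return buffer.getvalue()


csv_text = build_csv(quakes)
print(csv_text)
```

The point of the sketch is only that the header row does the work: the service decides how to place each row from the column names, so no coding is needed beyond naming the columns consistently.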

This is a well-thought-out interface from a onetime Apple engineer. (Warden said he worked on several software projects at Apple, including Final Cut Studio.)

Drawbacks: There’s no way to delete data once it’s been uploaded (you can get around this by using a Google Spreadsheet as a data source), and editing time is limited to as long as your browser is open and you haven’t started a new map. Embedded OpenHeatMap.com-hosted maps may be slow to load.

The documentation doesn’t make it clear whether you can set where the map is centered or what the default zoom level should be; Warden told me by e-mail that the system remembers where you last positioned and zoomed the map before saving. And this feature still can occasionally be buggy, although Warden is responsive to bug reports.

Skill level: Beginner.

Runs on: Web browsers enabled for Flash or HTML 5 Canvas.

Learn more: Its title notwithstanding, the four-minute video «How OpenHeatMap Can Help Journalists» offers a clear explanation for anyone interested in using the service. You can also view samples on the OpenHeatMap Gallery and check out this Guardian interactive map of where Facebook is used.

OpenLayers

What it does: OpenLayers is a JavaScript library for displaying map information. It’s aimed at providing functionality similar to those big companies’ code libraries — but with open-source code. OpenLayers works with OpenStreetMap and other maps, as this tutorial about use with Google shows.

Other projects build on it to add functionality or ease of use, such as GeoExt, which adds more GIS capabilities. For users who are comfortable hand-coding JavaScript and prefer not to use a commercial platform such as Google or Bing, this can be a compelling option.

Drawbacks: OpenLayers is not yet as developed or as easy to use as, say, Google Maps. The project page notes that it is «still undergoing rapid development.»

Skill level: Expert.

Runs on: Any Web browser.

Learn more: Try this OpenLayers Simple Example. A good sample is Ushahidi’s Haiti map.

There are other JavaScript libraries for overlaying information on maps, such as Polymaps. And there are a number of other mapping platforms, such as Google Maps, which offers numerous mapping APIs; Yahoo Maps Web Services, with its own APIs; the Bing Maps platform and APIs; and GeoCommons.

OpenStreetMap

What it does: OpenStreetMap is somewhat like the Wikipedia of the mapping world, with various features such as roads and buildings contributed by users worldwide.

What’s cool: The main attraction of OpenStreetMap is its community nature, which has led to a number of interesting uses. For example, it is compatible with the Ushahidi mobile platform used to crowdsource information after the earthquakes in Haiti and Japan. (While Ushahidi can use several different providers for the base map layer, including Google and Yahoo, some project creators feel most comfortable sticking with an open-source option.)

Drawbacks: As with any project accepting public input, there can be issues with contributors’ accuracy at times (such as the helicopter landing pad someone once placed in my neighborhood — it’s actually quite a few miles away). Although, to be fair, I’ve encountered more than one business listing on Google Maps that was woefully out of date. In addition, the general look and feel of the maps isn’t quite as polished as commercial alternatives.

Skill level: Advanced beginner to intermediate.

Runs on: Any Web browser.

Learn more: See the Quick Tutorial on the OpenLayers site.

Temporal data analysis

If time is an important component of your data, traditional timeline visualizations may show patterns, but they don’t allow for sophisticated analysis or a great deal of interaction. That’s where this project comes in.

TimeFlow

What it does: This desktop software is for analyzing data points that involve a time component. In a demo I wrote about last summer, creators Fernanda Viégas and Martin Wattenberg — the pair behind the Many Eyes project who are now working at Google — showed how TimeFlow can generate visual timelines from text files, with entries color- and size-coded for easy pattern spotting. It also allows the information to be sorted and filtered, and it gives some statistical summaries of the data.

TimeFlow offers a number of different ways to easily visualize data with an important time component.

What’s cool: TimeFlow makes it incredibly easy to interact with data in various ways, such as switching views or filtering by criteria such as date ranges or earthquakes of magnitude 8 or more. The timeline view offers a slider so you can zero in on a time period. While many applications can plot bar graphs, fewer also offer calendar views. And unlike Web-based Google Fusion Tables, TimeFlow is a desktop application that makes it quick and painless to edit individual entries.

Drawbacks: This is an alpha release designed to help individual reporters doing investigative work. There are no facilities for publishing or sharing results other than taking a screen snapshot, and additional development appears unlikely in the near future.

Skill level: Beginner.

Runs on: Desktop systems running Java 1.6, including Windows and Mac OS X.

Learn more: Check out Top tips.

Note: If you’re looking to publish visualized timelines, better options include Google Fusion Tables, VIDI or the SIMILE Timeline widget.

Text/word clouds

Some data visualization geeks think word clouds are either not very serious or not very original. You can think of them as the tiramisu of visualizations — once trendy, now overused. But I still enjoy these graphics that display each word from a text file once, with the size of the words varying depending on how often each one appears in the source.
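The counting behind a word cloud can be sketched in a few lines of Python. This is a generic illustration, not the method used by any of the tools in this section; the stop-word list, the sample sentence and the font-size range are all assumptions chosen for the example.

```python
import re
from collections import Counter

# A tiny, assumed stop-word list; real tools ship much longer ones.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "it", "is"}


def normalize(token):
    # Keep multi-letter all-caps tokens such as "IT" or "AT&T" as written,
    # and lowercase everything else so "The" and "the" count together.
    return token if len(token) > 1 and token.isupper() else token.lower()


def word_frequencies(text, stop_words=STOP_WORDS):
    # Allow "&" and apostrophes inside tokens so "AT&T" stays in one piece.
    tokens = re.findall(r"[A-Za-z0-9&']+", text)
    words = (normalize(token) for token in tokens)
    return Counter(word for word in words if word not in stop_words)


def font_sizes(counts, smallest=12, largest=48):
    """Scale each word's count linearly into a font-size range."""
    if not counts:
        return {}
    spread = max(max(counts.values()) - 1, 1)
    return {
        word: smallest + (largest - smallest) * (count - 1) / spread
        for word, count in counts.items()
    }


sample = ("The IT team said it moved the AT&T contract to the IT budget, "
          "and it is a big contract.")
counts = word_frequencies(sample)
sizes = font_sizes(counts)
print(counts.most_common(3))
```

Each word appears once in the result, and its size depends only on how often it occurred. The choices about case and punctuation in `normalize` and the token pattern decide whether acronyms survive the stop-word filter.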

IBM Word-Cloud Generator

What it does: Several tools mentioned previously can create word clouds, including Many Eyes and the Google Visualization API, as well as the website Wordle (which is a handy tool for making word clouds from websites instead of text files). But if you’re looking for easy desktop software dedicated to the task, IBM’s free Word-Cloud desktop application fits the bill.

What’s cool: This is a quick, fun and easy way to find frequency of words in text.

Drawbacks: Because it’s trying to ignore words such as «a» and «the,» the basic configuration can miss some important terms. In my tests, it didn’t know the difference between «it» and «IT,» and completely missed «AT&T.»

Skill level: Advanced beginner. This app runs on the command line, so users should be able to find file paths and plug them into a sample command.

Runs on: Windows, Mac OS X and Linux running Java.

Learn more: Check the examples that come with the download.

Social and other network analysis

These tools use a pre-Facebook/Twitter definition of «social network analysis» (SNA), referring to the discipline of finding connections between people based on various data sets. Investigative journalists have used such tools to, for example, find links between people who are involved in development projects or who are members of various boards of directors.

An understanding of statistical theories of network node analysis is necessary in order to use this category of software. Since I’ve only had a very basic introduction to that discipline, this is one category of tools I did not test hands-on. But if you’re seeking software to do such analysis, one of these might meet your needs.

Gephi

What it does: Billed as a Photoshop for data, this open-source beta project is designed for visualizing statistical information, including relationships within networks of up to 50,000 nodes and half a million edges (connections or relationships) as well as network analyses of factors such as «betweenness,» closeness and clustering coefficient.

Gephi can visualize networks of up to 50,000 nodes.

Runs on: Windows, Linux, Mac OS X running Java 1.6.

Learn more: Try this Quick Start tutorial (PDF).
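For readers unfamiliar with the network measures named in the Gephi description, here is a small Python sketch of two of them, closeness and the clustering coefficient, computed on a made-up five-person network. It uses common textbook definitions and does not claim to reproduce how Gephi calculates them; betweenness is left out because it needs a longer algorithm.

```python
from collections import deque

# Hypothetical undirected network: each person maps to a set of contacts.
graph = {
    "Ann": {"Bob", "Cat", "Dan"},
    "Bob": {"Ann", "Cat"},
    "Cat": {"Ann", "Bob"},
    "Dan": {"Ann", "Eve"},
    "Eve": {"Dan"},
}


def clustering_coefficient(graph, node):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    neighbors = graph[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    # Count each connected pair of neighbors once (a < b avoids doubles).
    links = sum(1 for a in neighbors for b in neighbors
                if a < b and b in graph[a])
    return 2.0 * links / (k * (k - 1))


def closeness(graph, node):
    """Reachable nodes divided by the sum of shortest-path distances."""
    distance = {node: 0}
    queue = deque([node])
    while queue:  # breadth-first search gives shortest hop counts
        current = queue.popleft()
        for neighbor in graph[current]:
            if neighbor not in distance:
                distance[neighbor] = distance[current] + 1
                queue.append(neighbor)
    total = sum(distance.values())
    return (len(distance) - 1) / total if total else 0.0


for person in graph:
    print(person,
          round(clustering_coefficient(graph, person), 2),
          round(closeness(graph, person), 2))
```

In this toy network, Ann is closest to everyone else, while Bob and Cat sit in the tightest cluster because their contacts all know each other.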

NodeXL

What it does: This Excel plug-in displays network graphs from a given list of connections, helping you analyze and see patterns and relationships in the data.

NodeXL merges the older and current definitions of SNA. It’s «optimized for analyzing online social media — it includes built-in connections to query the APIs of Twitter, Flickr and YouTube, allowing you to draw networks of users and their activity,» according to Peter Aldhous, San Francisco bureau chief for New Scientist magazine.

It also handles e-mail and conventional network analysis files (including data created by the popular — but not free — analysis tool UCINET).

Runs on: Excel 2007 and 2010 on Windows.

Learn more: Download this detailed free NodeXL tutorial (PDF) or these basic step-by-step instructions on analyzing your own Facebook social network (PDF). One Facebook app for downloading your own friend information for use in NodeXL is Name Gen Web.

Sharon Machlis is online managing editor at Computerworld. Her email address is smachlis@computerworld.com. You can follow her on Twitter @sharon000, on Facebook or by subscribing to her RSS feeds.

Stephen Few | Journalism is about making readers’ lives easier, not entertaining them (by @albertocairo)

Interview conducted by @albertocairo on his blog Periodismo con Futuro (El País, August 29, 2011)

Before the information explosion brought about by blogs and social networks, the options for studying the principles of visualization and infographics were limited to the works of a handful of pioneers from diverse backgrounds: there were the statisticians John W. Tukey and William S. Cleveland, the cartographers Jacques Bertin and Alan MacEachren, and the social scientists Howard Wainer and Edward T. Tufte.

There was also Stephen Few.

Few’s books, chiefly Show Me the Numbers and Now You See It, serve as bridges between the (sometimes excessively abstract) writings of the theorists in the first paragraph and the professional’s daily experience. Where Edward Tufte, his main influence, often gives in to a messianic, sententious style, Few is practical and direct without losing precision in handling concepts, probably a legacy of his past as a business intelligence consultant.

Alberto Cairo: In March 2011 you served as a judge for the Malofiej international infographics awards, which each year honor the best visualization in newspapers and magazines. During a conversation, you told me you had never been so intensely exposed to this type of graphic, since your specialty is quantitative business analysis. What caught your attention in the work of the many visual journalists who enter this competition?

Stephen Few: Although I was fortunate to evaluate many good projects, I also noticed that the most common mistakes in journalistic graphics are very similar to those made in the business world. Infographic designers tend to imitate what they see in other publications and do not consider whether those solutions suit the data they are handling. I saw many graphics that imitated nonsense like this one, from Good magazine:

Threetrilliondollarwar

AC: Visual journalists get carried away by fashions.

SF: That is certainly a serious problem. Today there are designers who think: «OK, I have twenty values I want the reader to compare. I’ll make a bubble chart!» They then set about drawing a bunch of circles arranged in an aesthetically pleasing way, without examining whether the data have a natural order and without taking into account that the human brain is not good at comparing areas, only heights and lengths, so a bar chart would be more appropriate.

I got the impression that many professionals take the data and simply look for a fun, original way to show them, instead of understanding that journalism consists, once the information has been gathered, in making readers’ lives easier, not in entertaining them. The information designer’s job is not to find the most novel graphic but the most effective one.

AC: I think the cause of this phenomenon is that quite a few journalists and designers hold a paternalistic mindset: they think that if they opt for more functional, denser graphics, these will also end up visually poor, with a technical, cold, unattractive look.

SF: Attracting attention with something unrelated to the information, such as ornaments and special effects, is a counterproductive trick. In journalism, how do you win over the reader? With a good headline, something that is eye-catching but is, above all, based on the text it heads. The reader will not read a feature or news story because its headline is beautiful, but because it suggests that the story it accompanies is interesting.

The same goes for graphics. You can create something visually spectacular, full of irrelevant decorative elements, but what do you gain? Nothing. At least, nothing to do with the central goal of a journalistic story, which is to inform. The decoration will attract the reader, who will immediately lose interest on finding that the content is irrelevant or illegible. This is the strategy of the lazy designer, willing to sacrifice the integrity of the data in favor of playfulness, of mere entertainment.

I suspect this is a consequence of media organizations, when hiring for their graphics departments, choosing only people who come from art schools and have software experience, without checking whether they have the training needed to tell stories from data. They may be good at dazzling with visual pyrotechnics, but that is a skill better suited to advertising than to journalism.

AC: Is that the case with David McCandless, the famous British designer, whom you have criticized in several recent articles ([1], [2])?

SF: In the week I spent as a Malofiej judge, I had to read many infographics similar to what McCandless has published in The Guardian and Wired. What puzzles me about the popularity he has achieved is that his work is interesting neither as design nor as visualization. His graphics tend to be superficial and, from a structural point of view, are full of errors. Look at one of his most celebrated infographics, titled The Billion Pound-O-Gram:

BillionPoundoGram

The aim of the graphic is to reveal that the size of Great Britain’s deficit (175 billion pounds, the black box on the right) is large compared with other numbers familiar to the Guardian reader. But this is a rather inconsequential message that can be summed up in a few words. Does the graphic help put the figure in context? Does it let us dig deeper and analyze the data from different points of view? Is the information structured to make comparisons easier? Try to answer the following questions without reading the numbers that appear inside each rectangle:

• Which is larger in economic terms: Mortgage Lending or the National Health Service (NHS)?

• What is the difference between Mortgage Lending and State Pensions?

• Is the difference between spending on State Pensions and the revenue of Tesco (a supermarket chain) similar to the difference between 62 and 59, or much larger?

• Which is larger: unemployment benefits (Income Support) or spending on Police?

It could be argued that those comparisons are not very relevant, given that the graphic focuses on the size of the deficit. So next, try to tell me how much larger or smaller than the deficit the rectangles Income Tax and Bailout: Asset Purchasing and Lending are. It is almost impossible.

All those questions are very easy to answer with a simple bar chart. See how easy it is to rank the amounts from largest to smallest and to perceive the proportion between each of them and the deficit, represented by a thick black vertical line.
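The idea of a sorted bar chart with a reference line can be sketched as plain text in Python. This is only an illustration of the layout Few describes, not his redesign: the 175 (billion pounds) deficit figure comes from the interview, while the item names and every other amount are made-up placeholders.

```python
# Amounts in billions of pounds. Only the deficit figure (175) comes from the
# interview; the items and their values are made-up placeholders.
DEFICIT = 175
amounts = {"Item A": 250, "Item B": 62, "Item C": 59, "Item D": 120}


def bar_chart(amounts, reference, width=50):
    """Return text bars sorted from largest to smallest, with a '|' marking
    the reference value at the same column on every row."""
    longest = max(max(amounts.values()), reference)
    ref_col = round(reference / longest * width)
    label_width = max(len(label) for label in amounts)
    lines = []
    ranked = sorted(amounts.items(), key=lambda item: item[1], reverse=True)
    for label, value in ranked:
        cells = ["#"] * round(value / longest * width)
        cells += [" "] * (width + 1 - len(cells))
        cells[ref_col] = "|"  # vertical reference line for the deficit
        bar = "".join(cells).rstrip()
        lines.append(f"{label:<{label_width}} {bar} {value}")
    return lines


chart = bar_chart(amounts, DEFICIT)
print("\n".join(chart))
```

Because every bar starts at the same baseline and the reference mark sits in one column, ranking the amounts and judging each against the deficit takes a glance, which is the comparison the rectangles make hard.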

RedesignBillion
AC: I sometimes use an analogy to explain that the choice of a graphic’s form must be constrained by the tasks in which that graphic should help the reader. In other words, an infographic is a cognitive tool. Think of a hammer: there are big hammers and small ones, beautiful and ugly ones, made of different materials; yet deep down they all have a similar shape. How do you convince journalists and designers that the priority has to be structure and not aesthetics (which can be a later concern)? How do you persuade them that it is advisable to act more like engineers than like artists?

SF: For me it is more or less straightforward, since I usually address corporate audiences, who have already accepted that a graphic is an instrument for study and analysis, not a mere ornament to fill a page or lighten a PowerPoint presentation. That said, it is also true that I constantly meet people fascinated by what I call data art.

Data art and the visualization of those same data are different disciplines. I have nothing against art, of course: a few days ago I went to an exhibition at the contemporary art museum in San Francisco and marveled at several works by Henri Matisse and Picasso. When I look at a Picasso painting, however, I do not expect to extract a clear story but to be moved. The opposite happens when I use a visualization: I expect it to help me understand the truth hidden behind enormous quantities of numbers.

AC: That may be a good philosophy for the visual journalist.

SF: No, that must be the philosophy of the visual journalist, if one understands journalism as the professional activity whose aim is to give citizens the information they need to make better public and private decisions.

AC: Suppose that, instead of speaking to an audience of executives and middle managers at a company, we are addressing journalists with no experience at all in visualization. What are the fundamental steps in creating a good graphic?

SF: I don’t think the procedure is very different from the one a business analyst follows to study data and present them to colleagues. The first thing the analyst does is ask what messages those data convey, or should convey. A journalist, likewise, must think about the story the data hide that may be of interest and use to readers. If you can summarize that story in words, you will already have come a long way.

The next step is to look for the chart types best suited to the nature of the information. Do I want my readers to compare HIV infection rates in several African countries? Then the most appropriate choice may well be a bar chart, ordered from largest to smallest, as in the example we discussed earlier.

To make this decision, you must learn the vocabulary and grammar of visualization: how geometric shapes encode abstract concepts and quantitative values. It is essential to become familiar with the work of people such as Jacques Bertin, in his Semiology of Graphics; Colin Ware, of the University of New Hampshire, who has written two wonderful books on how the brain processes graphical information ([1], [2]); and Stephen Kosslyn, perhaps starting with his recent interview. Both Ware and Kosslyn delve into the relationship between presentation, perception and cognition. If a journalist’s aspiration is to be understood by the public, he or she must study how the human brain works.

One last fundamental step: develop a critical sense. If usability studies have long been part of software and Web page design, they can also be incorporated into visualization and infographics.

AC: On your blog you write about trends you consider negative, but you also highlight examples of good work. Any recommendations of innovations, projects and professionals worth following closely?

SF: The world of visualization is changing very quickly. One novelty that comes to mind immediately is the momentum behind collaborative visualization networks, in which groups of scientists and experts work on the same data from different parts of the world. A well-known example, accessible to the average user, is IBM’s Many Eyes.

As for never-before-seen graphic forms, a company called Panopticon surprised me some time ago with a new type of visualization called the horizon graph. It makes it possible to compare more than fifty variables over time because the numbers are represented not only by the height of the lines but also by shades of color. It takes a while to get used to reading it but, as I explained in an article, it can be very useful.

Recent books… Perhaps Visual Language for Designers, by Connie Malamed. It is among those I have liked most in recent years. The principles it describes are a good starting point for any career in journalistic visualization.

Alberto Cairo (Twitter: @albertocairo) is director of infographics and multimedia at Época magazine (Editora Globo, Brazil).

Stephen M. Kosslyn | Journalism and design schools should teach cognitive psychology (by @albertocairo)

By: Alberto Cairo (@albertocairo) in Periodismo con futuro (elpais.com)

Kosslyn1
(Photo: Microsoft)

Stephen M. Kosslyn is a teacher of teachers. Although during our telephone conversation I do not ask about his students, not even the most famous ones (the actress Natalie Portman was an assistant in his neuropsychology laboratory at Harvard), he does not avoid mentioning some, such as Steven Pinker (How the Mind Works), of whom Eduard Punset said he should be a candidate for the Nobel Prize.

Kosslyn speaks of his students with a certain modesty and in slow, considered sentences; in a similar tone, he explains why we journalists should care about how each reader’s brain deals with texts, graphics and photographs, starting by learning some rudiments of cognitive psychology, which studies the interrelation between perception, memory and knowledge.

Kosslyn’s curriculum vitae runs to several pages, so let us limit ourselves to his most notable credentials: director of the Center for Advanced Study in the Behavioral Sciences at Stanford University since January of this year; he was professor emeritus at Harvard, chair of the Department of Psychology at the same institution and, between 2008 and 2010, dean of its Social Sciences area. A neuroscientist specializing in perception and mental imagery, Kosslyn has written several books on visual communication, as well as numerous academic articles ([1][2]).

Kosslyn2

Alberto Cairo – Sus libros de divulgación, como Graph Design for the Eye and MindDiseño de gráficos para el ojo y la mente»), se ocupan de la aplicación de la psicología cognitiva a la comunicación¿Por qué se interesó por este asunto?

Stephen M. Kosslyn – Hace años, cuandoSteven Pinker era estudiante de doctorado en Harvard, una empresa llamada Consulting Statisticians Inc. se puso en contacto conmigo. Una agencia gubernamental les había pedido que hiciesen algunos estudios sobre por qué los gráficos estadísticos funcionan tan bien para comunicar ciertos tipos de información. Steve y yo pasamos tres años investigando, aunque nunca llegamos a publicar los resultados en un único volumen. Todos mis libros de tono menos académico y orientación más práctica tienen su origen en esa época.

AC – The ones about PowerPoint as well?

SK – Yes. In fact, they are based on the same principles and rules as the ones about graphics. When I was chair of the Department of Psychology at Harvard, one of my duties was to attend numerous talks by visiting professors, candidates for jobs at the university, and so on. One day, I began to notice that the slides used at those events did not respect what we know about how the mind works.

I remember one case that particularly struck me. It was a lecture given by a psychologist. He began by showing images of the Solar System: black backgrounds full of stars, planets in very light colors, and so on. The problem was that on top of those planets he had placed tiny white text set in fonts with very thin strokes. That made the text not only very hard to see, but also hard to read. It was a revelation. I said to myself: “This is incredible; I can identify a good number of very basic problems; these slides ignore what we know about the importance of contrast, for example. Why?” So I started taking notes.

AC – So all of your work for a general audience is the product of that “eureka moment”…

SK – Oh, no, it was not just that lecture. After it, I began to pay attention not only to the content of other talks, but also to the way visual aids were used in them. All of them, even those given by experts in perception, had serious flaws.

That is what led me to write Clear and to the Point, my book on principles of psychology applied to PowerPoint. I mention it because some of the reviews pointed out that what I say is fairly obvious: be clear and direct, do not try to put too much information on each slide, organize the elements on the page hierarchically, use contrast correctly… But, judging by what I was forced to endure every day, the advice I gave was necessary. Anyone who has to face an audience should become familiar with how perception, memory, and the mechanisms of reasoning work. My goal has always been to help with that process.

AC – Does it make sense, then, to include courses in cognitive psychology in journalism schools and graphic design schools? I should warn you that I must be breaking several ethical rules with this question, because I have a personal interest in it; it is one of my proposals for the future of teaching in those professions…

SK – I have no doubt whatsoever. It is advisable for several reasons. The first is that journalists always address audiences made up of human beings. Another obvious point, isn't it? Not so much. Perception, the processing of information coming from the senses, comprehension, and memory have many limitations and peculiarities. Understanding them in depth is a requirement for being a good communicator and for presenting graphics effectively. Being able to anticipate how your audience will process a piece of content helps you avoid falling prey to the weaknesses of the mind while taking advantage of its innate capacities.

This applies not only to graphics but also to text, to the way news stories and features are written. There is no difference between the two, in the sense that learning how the eyes and the brain work is an advantage for any professional. The only way to acquire this knowledge is through a solid education.

AC – Why are mistakes in writing, creating graphics, and designing PowerPoint presentations so common, even among those who best know the ins and outs of the mind?

SK – Because there is a dissociation between our intuitions and our knowledge. In our day-to-day lives, it is common for us to let ourselves be guided by intuitions and not always apply what we know. The creative process is almost automatic; in it, the prejudices and conventions that we acquire over a lifetime carry more weight than reason. That is natural: if many of our everyday activities were not automatic and unconscious, we would not be able to survive.

Applying knowledge to creative work requires great effort, and I doubt we can do it while we are working on a text or a graphic. That is why every project should have two phases: the creative one, which is fast, intuitive, and automatic, in which we generate the product, and the critical one, in which we stop, edit, and filter the content and the way it is presented. This doubles the amount of energy we need to invest, and not everyone is willing to do it, because they trust their intuitions too much. They believe, wrongly, that they have internalized their knowledge to the point that it has become automatic.

(Photo: Jenn Chang)

AC – Information visualization is a constantly growing field. According to the most common definition in its foundational texts, it is a discipline that aims to create interactive graphical presentations that “amplify cognition”, that is, our capacities for perception and comprehension. Is this a metaphor, or is it true that, in a certain sense, when we use a visualization or read a graphic, these become extensions of our mind, in the same way that a hard drive and a book are extensions of our memory?

SK – To answer this question, we need to make a distinction between depictive graphics and symbolic graphics. A depictive graphic is one that resembles what it represents, such as the floor plan of an apartment, an explanation of how a device works, the map of a region, and so on. In a symbolic graphic, by contrast, the relationship with the phenomenon represented is formal: think of statistical graphics.

Well, the way those two types of graphic are read and interpreted depends on each person. Reading the first type is simple but, to understand the second, one must learn certain conventions: that there are X and Y axes, that the height of the bars is proportional to the quantities they encode, and so on. Nowadays statistical graphics are very common, so we think that reading them is natural, but it is not.

Maria Kozhevnikov, a scientist of Russian origin, has studied this problem. In several articles, she has shown that not everyone understands statistical graphics easily. It all depends on the activation patterns of certain brain regions, which vary from individual to individual. In one of her studies, Maria showed that artists, architects, and scientists interpret graphics in different ways. The same happens with ordinary readers.

For example, there is a group of people for whose brains graphics, even symbolic ones, represent real objects. They read abstract graphics as if they were pictorial representations of real, physical phenomena, and they end up thoroughly confused.

AC – Are symbolic, statistical graphics like written language? Before we are able to read them, do we need to learn their vocabulary, their grammar, their syntax? That is what is suggested by books such as Reading in the Brain, by the French neuroscientist Stanislas Dehaene, which relate our ability to read to the innate ability to extract visual patterns from our surroundings…

SK – Exactly. A good analogy.

AC – When we interpret symbolic and statistical graphics, do we use the same brain areas in which our ability to read text resides?

SK – Excellent question. I do not think anyone has asked it until today. It could be an interesting line of research.

AC – In your books you define eight principles for the correct visual presentation of content, grouped into three categories. The first of them is “know your audience”.

SK – Yes. That category contains the two most important principles, on which the rest depend. The first is the principle of “relevance”, which means that a graphic should contain only the amount of information needed to support an argument or tell a story, no more and no less. In fact, the principle applies to text as much as to graphics: before starting to work, one must ask oneself what one wants to say.

The second, the principle of “appropriate knowledge”, states that we should use codes that our audience already understands. It is acceptable to use innovative graphics, but always taking care to include cues and explanations so that the reader does not get lost. Something similar happens with text: we do not write for specialized audiences in the same way as for the general public.

AC – However, not all the graphics we see in the media today have a specific message. Some of them (and I am thinking of one by The New York Times, with the main Census data) do not pose questions and then answer them for readers; instead, each reader is free to navigate them, interact with them, and so on. In a way, the user becomes the editor. Do graphics of this kind respect the principles of “relevance” and “appropriate knowledge”?

SK – It all depends on the goals. Those tools are databases, not graphics in the strict sense. But even in them, the designer makes decisions about what to include and what to leave out, and about how to do it. So both principles apply: the designer must have an idea of the kinds of questions users will want to answer, and choose the data and program the interface accordingly. In any case, I suspect that those graphics are so overwhelming that most readers pay no attention to them.

AC – One of your interests is the way the brain generates and manipulates mental images. However, psychologists and philosophers such as Zenon Pylyshyn and Jerry Fodor reject the idea that we have images in the mind and argue that our thinking is entirely propositional and verbal, that we reason using only words. Why is this notion so controversial, when the experience of “seeing” images in the mind is so common? Does it have to do with the rejection, by so many academics and thinkers, of the visual in general in favor of the textual?

SK – There are plenty of motives for denying that many human beings experience certain patterns of neural activation as images and that they use those images as tools for reasoning. Some of them have historical roots that go back to John Locke and the empiricists.

Others are more recent. Some people think that images are a less sophisticated form of representation than language. They base that mistaken idea on the fact that children first learn to draw and only later to write, so they conclude that we manipulate images only before we learn the “correct” way to reason and communicate: spoken and written language. That is absurd, of course.

Alberto Cairo (Twitter: @albertocairo) is director of infographics and multimedia at the magazine Época (Editora Globo, Brazil).

Hans Rosling depicts world population growth with a revolutionary technique

In the next 50 years, the world population will reach 9 billion people. Only by raising the standard of living of the poorest will we be able to keep population growth in check.

Hans Rosling develops this argument using a colorful new data visualization technology ;).

Digital Information R/evolution

This video explores the changes in the way we find, store, create, critique, and share information. This video was created as a conversation starter, and works especially well when brainstorming with people about the near future and the skills needed in order to harness, evaluate, and create information effectively.

[youtube http://www.youtube.com/watch?v=-4CV05HyAbM&w=480&h=390]

This video is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 License. So you are welcome to download it, share it, even change it, just as long as you give me some credit and you don’t sell it or use it to sell anything.