Archive for the ‘Data Analysis Software’ Category

SPSS or Excel?

Monday, May 26th, 2008

Why use a data analysis package like SPSS when you could use Excel? I’ve just come across an interesting marketing piece from SPSS that goes into the benefits one gets from using a dedicated data analysis package instead of trying to do all of your analysis in a spreadsheet. While it would be fair to expect that this isn’t necessarily an unbiased comparison, it might offer some food for thought to those of you trying to figure out why you should bother to upgrade.

Key reasons offered by “Discover Secrets Your Spreadsheet Can’t Tell You”:

  • Easy access to descriptive statistics and frequencies: True. While you can do descriptives in Excel using some of the built-in functions and the data analysis add-in, it is a lot easier and faster in SPSS.
  • Wider variety of charts & graphs: True, although I tend to find Excel easier to manage.
  • Better, more flexible pivot tables: Sort of true. That is, true if you have SPSS Tables. If you don’t, then in my opinion Excel pivot tables are easier to work with. SPSS Tables, on the other hand, is extremely easy to use and lets you do a lot of things that you can’t do with Excel pivot tables.
  • Full set of statistical tests: True. While it is definitely possible to run statistical tests in Microsoft Excel, they’re much harder to find and work with compared to SPSS, where they pretty much come “free” with every function you run.
  • Easy to run similar reports and graphics for subsets: True. Using the “Split” function in SPSS, it is relatively easy to create tables and charts for subsets without doing any extra work. Or you can create syntax (SPSS’s macro language) that lets you reuse your tables and codes over and over again.
  • Labels instead of codes in your reports: I love this feature. Just because your survey software makes Male=1 and Female=2 doesn’t mean you want to see lots of 1s and 2s in your reports. And while it isn’t difficult to use search/replace in Excel to change all of your 1s to Male and your 2s to Female, SPSS lets you keep your values intact.
  • Accurate results when some data is missing: Sort of true. For this item, they point to the benefits you get from using the SPSS Missing Value Analysis add-on module (an extra $800 or so). This tells you whether the questions that were skipped by your respondents will impact your analysis, and will even estimate what these values should have been. Obviously Excel can’t do anything like that, but keep in mind you need to buy the extra module to get it to work.
  • Helps you spot data-entry errors or unusual data points: Certainly SPSS can help with this one, but I think you can get these types of results pretty easily in Excel.
  • Easy import functions: I’m not sure that I completely agree with this one. It is true that it is easy to bring in text files. And they do provide functionality to bring in ODBC databases, including Excel spreadsheets, Access tables and SPSS databases. But the interface for doing so is a little funky and the experience isn’t as clean or smooth as it is with Excel.
  • Unlimited rows: This point describes how SPSS can handle an unlimited number of rows while Excel can only handle 65,000. That was true of Excel 2003 and earlier (65,536 rows). Microsoft Excel 2007 raises the limit to 1,048,576 rows per worksheet, which is far more but still not unlimited, so SPSS’s assertion was accurate when the article was published.
  • Using SPSS saves time and increases productivity: I suppose that really depends on what it is that you’re trying to do. There are a lot of analyses that I find easier to do in Excel. But certainly if you’re doing statistical analysis it is easier and faster in SPSS.
  • SPSS makes it easy to understand statistical results. SPSS has added a lot of extra help files and tutorials that explain how you can/should interpret a lot of the statistical jargon that the software spits out. Excel obviously does not.
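
To make a few of these points concrete (labels instead of codes, frequencies, and repeating the same report for subsets), here is a minimal Python sketch. It only illustrates the ideas; it is not SPSS syntax and it does not come from the SPSS piece. The sample data, the `GENDER_LABELS` mapping and the function names are all made up for the example.

```python
from collections import Counter
from statistics import mean

# Hypothetical survey records: gender is stored as a numeric code (1 or 2),
# the way many survey tools export it.
responses = [
    {"gender": 1, "region": "East", "satisfaction": 4},
    {"gender": 2, "region": "East", "satisfaction": 5},
    {"gender": 2, "region": "West", "satisfaction": 3},
    {"gender": 1, "region": "West", "satisfaction": 2},
    {"gender": 2, "region": "West", "satisfaction": 4},
]

# Value labels live beside the data; the stored codes are never overwritten.
GENDER_LABELS = {1: "Male", 2: "Female"}


def frequencies(rows, field, labels=None):
    """Count each value of `field`, reporting labels instead of codes if given."""
    labels = labels or {}
    counts = Counter(row[field] for row in rows)
    return {labels.get(value, value): n for value, n in counts.items()}


def split_report(rows, split_field):
    """Run the same small report once per subset, similar in spirit to a split."""
    report = {}
    for group in sorted({row[split_field] for row in rows}):
        subset = [row for row in rows if row[split_field] == group]
        report[group] = {
            "n": len(subset),
            "mean_satisfaction": mean(row["satisfaction"] for row in subset),
            "gender": frequencies(subset, "gender", GENDER_LABELS),
        }
    return report


gender_counts = frequencies(responses, "gender", GENDER_LABELS)
by_region = split_report(responses, "region")
print(gender_counts)
print(by_region)
```

The design point is the one SPSS makes: the labels are applied only when the report is produced, so the underlying 1s and 2s stay intact, which a search/replace in a spreadsheet does not give you.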

A few reasons why I still do a lot of stuff in Excel:

  • For most people, the learning curve is much less steep with Excel: Learning SPSS was initially an unpleasant experience. It has a lot of options that don’t make sense until after you’ve spent a lot of time with the program. Once you’ve learned the software you’ll be amazed that you ever lived without it (or some other data analysis package) but until then you’ll spend a lot of time cursing it.
  • It’s expensive. Especially if you already have Excel. Expect to spend over $1,700 for a copy.
  • Charts are easier to manage/control in Excel: In my opinion, at least. While SPSS has a lot of neat charting features, they aren’t as dynamic as Excel’s chart functionality. When creating a presentation, I often need to go back in and tweak the charts, rearrange the data or rearrange the bars. In Excel, this is as easy as editing the underlying spreadsheet, which automatically updates the PowerPoint. In SPSS, you have to recreate the chart and recopy it into the presentation.
  • More flexible use of functions: Excel has a lot more functions than SPSS and gives you more flexibility in how you use them.

Read “Discover Secrets your Spreadsheet Can’t Tell You”

SPSS 16 for Mac Doesn’t Make the Cut

Saturday, March 29th, 2008

Bertolt Meyer has written a not-so-happy review of SPSS for the Mac 16.0. His general thesis is that it is the “most insulting piece of software” he has ever come across. He felt that it neither looks nor acts like a Macintosh application, that it isn’t properly internationalized, and that it has more than a dozen bugs. (more…)

Review of SPSS Tables 16 (SPSS add-on)

Thursday, March 27th, 2008

Let’s say you’re a market researcher, you have an extra $1000 lying around, and you’re looking for an easier way to improve the look, feel and efficiency of your cross-tabs. What do you buy? If you’re me, you buy the Tables add-on for SPSS. While the text below certainly isn’t a detailed tutorial on how to use SPSS Tables, it should give you an idea of the features it makes available to help you decide whether it is worth the money. (more…)

SPSS 17.0 Drops Support for PowerPC Macs

Sunday, March 23rd, 2008

SPSS announced recently to its Mac-based customers that SPSS 17.0 for Mac would not be released for the PowerPC-based Mac, effectively discontinuing SPSS development for the PowerPC. PowerPC Mac users presently represent about 3% of all online computer users, down from 4.2% back in 2006 when the Intel-based Mac was released. This is based on the following letter, which was recently e-mailed to SPSS for Mac users: (more…)

My Top 5 Free SPSS Help Web Sites

Sunday, March 23rd, 2008

I spend a lot of time working in SPSS, and occasionally I need answers about various techniques and methods that aren’t readily available in the included documentation. Fortunately, there is a tremendous amount of free SPSS information and training materials scattered all over the web. Here are a few of my favorites. (more…)

SPSS Text Analysis for Surveys Webcast

Monday, March 10th, 2008

Anderson Analytics and SPSS are offering a free webcast on March 20, 2008 at Noon EST in which Senior Consultant Jesse Chen will offer creative tips and tricks for analyzing unstructured (text) data. The webcast will last about an hour and will feature Chen using a variety of real-world case studies (probably integrating SPSS Text Analysis for Surveys).

SPSS 16 New Features

Friday, April 27th, 2007

In the upcoming SPSS Directions User Conference in Prague (May 16) Product Management Director Kyle Weeks will discuss some of the new features in SPSS 16. These include:

  • SPSS 16 has a new Java interface allowing for Windows, Mac, and Linux versions of SPSS, a searchable Output Viewer, resizable dialogs and more;
  • Improved data editor (adds find and replace capabilities to both variable view and data view). Also Unicode support and import/export of Excel 2007 data;
  • Syntax to change string length and data types; ability to set a permanent default working directory; elimination of short/long string distinction; ability to suppress the number of active datasets.
  • More powerful statistics, including a new Neural Networks add-on module, a new Partial Least Squares algorithm, a new Cox Regression for Complex Samples module, support for algorithms written in R and improvements to Generalized Linear Models and Generalized Estimating Equations;
  • Latent Class Analysis in Amos 16
  • SPSS 16.0 has improved programmability (see below)
  • More integration with SPSS Predictive Enterprise Services, allowing you to store/retrieve and query to/from the Predictive Enterprise Repository via both the user interface and syntax
  • Multi-threaded procedures for improved performance and scalability.

Other sources also report that SPSS 16.0 for Windows will use a new syntax editor. We can also assume that it will support Vista (since they still haven’t released a patch for SPSS 15).

SPSS has also indicated that some of the original functionality of SPSS Trends and SPSS Tables that has since been superseded by newer functionality will be eliminated. In SPSS Trends 16, the Exponential Smoothing, Autoregression, and ARIMA dialogs will be removed, while the more flexible Create Models, Apply Models, Seasonal Decomposition and Spectral Analysis dialogs will remain.

In SPSS Tables 16, Basic Tables, General Tables, Multiple Response Tables, and Tables of Frequencies will be removed, while the more flexible Custom Tables and Multiple Response Sets will remain.

It is worth noting that all of the functionality offered by the removed dialog boxes will continue to be available through syntax.

Details on the new Programmability of SPSS 16.0:

  • EXTENSION command for user procedures with SPSS syntax
  • Dataset features for complex data management
  • New dataset class extends Python transformation program capabilities to multiple datasets
  • Similar to INPUT PROGRAM but can read and write datasets
  • Multiple input and output datasets
  • User code written in Python
  • Ability to use R procedures within SPSS through R Plug-In
  • Provides ability to run R code within SPSS
  • Use to take advantage of statistical capabilities in R
  • Access active SPSS dataset
  • Write results to SPSS Viewer
  • Improved implementation of User Procedures
  • Can be written in Python but specified using SPSS traditional syntax
  • User never writes or sees Python code
  • Used as if a built-in SPSS command
  • Python module called with syntax already checked and processed by SPSS
  • More general PLS module
  • Dialog box interface tools coming in SPSS 17

I still wish they would add the ability to organize variables in folders…maybe in SPSS 17?

Tim Macer reviews streamBASE GmbH’s Coding-Modul

Friday, April 27th, 2007

In the March 2007 issue of Research, Tim Macer reviewed streamBASE GmbH’s Coding-Modul, a program specifically designed to assist in the process of coding a significant number of open-ended questions. Tim gave the software a generally positive review (4 out of 5 for ease of use; 4.5 out of 5 for value). 

Tim liked the fact that Coding-Modul was a well-crafted system full of practical features for coding; that it allows you to easily distribute ‘packages’ of coding work to non-net connected individuals who are using standalone PCs; that it integrates seamlessly with Readsoft Forms; and that it has powerful administrative features to manage workflow.

Coding-Modul lost points because it is Windows-based only, its automation features for typed texts are limited, and the documentation is not yet available in English (although it may be now).

SPSS vs. STATA

Friday, April 13th, 2007

Found an interesting comparison between the features of SPSS and STATA (two statistical analysis packages), as provided by several statisticians on Windows Live Spaces:

SPSS Advantages:

  • Slightly more user friendly in making complex tables & graphs
  • Nice routines for testing interactions in logistic regression models
  • Friendly ANOVA commands
  • Generally easier to use
  • Sophisticated survival analysis

STATA Advantages:

  • Much easier to run a probit
  • Much better documentation
  • Can do a lot more procedures than SPSS
  • Great company support, friendly user base
  • Multiple pooled cross sectional time series routines
  • Count procedures (Poisson, negative binomial and zero routines)
  • Maximum likelihood estimators (Tobit, multinomial logit, ordinal logit, ordinal probit)
  • Huber-White correction for heteroskedasticity
  • More comprehensive ANOVA routines
  • Cox regression
  • Duration analysis procedures
  • Capability to estimate models for complex surveys
  • Better weighting capability (pweights vs. aweights and iweights)
  • Ability to take clustering into account
  • Lots of user written solutions
  • Much better handling of longitudinal panel data
  • Event history analysis capabilities
  • Panel data analysis capabilities
  • Faster development than SPSS
  • Better leasing arrangement

What this all means to market researchers I cannot say. I don’t use many of the statistical procedures they describe in my day-to-day work, and I’ve never tried STATA.

Statistical Analysis with R

Saturday, March 24th, 2007

University of Missouri graduate student Mitch Hardin recently posted a note on his blog about how after spending a lot of quality time with SPSS he switched to R, a free software environment for statistical computing and graphics that runs on a variety of platforms (Windows, MacOS, Unix).

Although R is “almost entirely command-line driven,” Mitch likes the fact that it offers more information about what is going on and that there are a lot of user-defined functions. Personally, I can’t imagine using a command line to do my stats processing, but then I’m not a big one for getting things done in SPSS syntax either. I couldn’t find much evidence of people using R for marketing research, but the software is free and I can’t imagine why it wouldn’t work if you needed a powerful statistical package but didn’t want to spend a lot of money.

SPSS 15 Doesn’t Work with Microsoft Vista?

Thursday, March 22nd, 2007

I came across a post on comp.soft-sys.stat.spss in which the purchaser of a new Dell system with Vista Home Premium edition was unable to install SPSS 13, 14, or 15. They indicate that they spoke to SPSS support, who told them that none of the versions which require activation will install on Vista.

In a follow up message, it was noted that SPSS was working on a patch and that the estimated date of release is the end of April, 2007.

Followup: SPSS has released a hotfix to address the problem. It can be downloaded from the SPSS Support Website (login required; you can use user: guest/password: guest). In addition to the hotfix, the page also identifies the procedure for installing SPSS 15 on Vista.

Tim Macer reviews GMI Research Analyzer 2.0

Tuesday, February 27th, 2007

Tim Macer recently reviewed GMI’s Research Analyzer in the February 2007 issue of Research Magazine. Research Analyzer is a fairly easy to use package for analyzing your data and creating reports without the usual hassle that comes from a more "statistical" program like SPSS. Basically, take everything you would do to analyze your data in Excel or SPSS and develop a program that is specifically designed to streamline the process and you get GMI Analyzer. Tim gave the program a 3.5 out of 5 for ease of use; a 4.5 out of 5 for cross-platform compatibility; and a 4 out of 5 for value for money.

Tim liked the fact that GMI Research Analyzer is easy to master without taking a class; it has an intuitive drag-and-drop interface; it neatly combines online data serving with offline convenience; and it has a serious range of analytical capabilities. He didn’t like the fact that controlling the look and feel of the output could be difficult; that it doesn’t support the making of blanket changes to tables and charts that are already set up; and he didn’t think there was enough documentation.

Compare GMI Research Analyzer to SPSS Desktop Reporter and web-based MarketSight.

SPSS Desktop Reporter

Thursday, December 21st, 2006

Not to go too much into detail yet, but we’ve been spending some time with SPSS Desktop Reporter 4.0. Wow. Despite the price, it’s a great software package with a lot of extremely useful features and worth every penny.

What it allows you to do is to take an SPSS file (or any variety of other data files), create crosstab tables and charts (doing all sorts of fancy analysis along the way) and then save the structure of what you’ve created so that you can then rerun all of the tables and charts against an expanded dataset later on. That makes it great for surveys that feature multiple waves. But it’s also great for ad-hocs. If you’re a fan of the SPSS Tables module you’ll absolutely love SPSS Desktop Reporter, which is kind of like what you’d get if you built an entire application around SPSS Tables.

Like most SPSS products, you definitely pay for what you get and you’re probably looking at spending around $2,800 for one copy (the price goes down for additional copies). If you have the budget available, it’s definitely worth taking a look at.

Learn more at the SPSS web site.

Affordable Data Tabbing Software

Wednesday, December 6th, 2006

I recently received a request for information about the data tabbing (cross-tab, table generating) software that is available out there and thought I might share what I know. Personally I was a little surprised that there aren’t more software packages out there that do this kind of work, and the ones that are available are fairly expensive.

The idea of this type of software is that you can take a data set (perhaps in SPSS format or some other standard or not so standard format) and almost automatically generate tables for each of the questions in the dataset. Usually each table would feature a number of columns representing different market segments (the "banner") while the various responses to each question would be shown in each row.

Such software is usually capable of automatically doing various statistical tests, sometimes including but not necessarily limited to t-tests, z-tests, ANOVA and Chi-Square. Some programs can also weight and sort the data, as well as merge data sets, provide sample balancing, generate charts to go with each table, and provide output into a number of different formats ranging from text files to Excel spreadsheets to PowerPoint templates.
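
For readers who haven’t seen this kind of output, here is a rough Python sketch of what a tabbing tool does for a single question: build the table with the banner segments as columns, compute column percentages, and run a Chi-Square test statistic. It is a toy under stated assumptions, not how any of the packages below actually work; the data, segment names and function names are invented for illustration, and no p-value is computed.

```python
from collections import Counter

# Hypothetical respondents: (segment, answer) pairs. Segments form the banner
# (columns) and the answers to one question form the rows.
data = [
    ("Under 35", "Yes"), ("Under 35", "Yes"), ("Under 35", "No"),
    ("35 and over", "Yes"), ("35 and over", "No"), ("35 and over", "No"),
    ("35 and over", "No"),
]


def crosstab(pairs):
    """Return (rows, cols, counts) where counts[(row, col)] is a cell count."""
    counts = Counter((answer, segment) for segment, answer in pairs)
    rows = sorted({answer for _, answer in pairs})
    cols = sorted({segment for segment, _ in pairs})
    return rows, cols, counts


def column_percentages(rows, cols, counts):
    """Express each cell as a percentage of its banner column's base."""
    table = {}
    for col in cols:
        base = sum(counts[(row, col)] for row in rows)
        for row in rows:
            table[(row, col)] = 100.0 * counts[(row, col)] / base
    return table


def chi_square(rows, cols, counts):
    """Pearson chi-square statistic for the table (statistic only)."""
    total = sum(counts[(r, c)] for r in rows for c in cols)
    row_totals = {r: sum(counts[(r, c)] for c in cols) for r in rows}
    col_totals = {c: sum(counts[(r, c)] for r in rows) for c in cols}
    stat = 0.0
    for r in rows:
        for c in cols:
            expected = row_totals[r] * col_totals[c] / total
            stat += (counts[(r, c)] - expected) ** 2 / expected
    return stat


rows, cols, counts = crosstab(data)
percentages = column_percentages(rows, cols, counts)
statistic = chi_square(rows, cols, counts)

# Print a plain-text table: one row per answer, one column per banner segment.
print("".ljust(6) + "".join(col.rjust(14) for col in cols))
for row in rows:
    cells = "".join(f"{percentages[(row, col)]:13.1f}%" for col in cols)
    print(row.ljust(6) + cells)
print(f"Chi-square statistic: {statistic:.3f}")
```

A real tabbing package repeats this for every question in the dataset and adds weighting, significance flags and export formats, which is most of what you are paying for.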

WinCross is perhaps the most well known of the bunch, and has been around a long time (it is presently at version 7). It is entirely compatible with SPSS 15.0; allows for 32,000 data columns and up to 3,000 rows per table; offers a number of different statistical tests; and can import and export in a number of formats. My biggest complaints are that the user interface seems like it was developed about 10 years ago, and that it costs $1,995 per copy (the more copies you buy, the less it costs). Learn more at the WinCross web site.

Version 2 of GMI Research Analyzer was recently released by Global Market Insite. It will also take your data set, crosstab it, create reports, and export it into a number of different formats, but it also allows you to do drag-and-drop analysis of your data and has a much friendlier interface. It can be purchased from the GMI web site for $995 per copy.

MarketSight is a completely online tool for analyzing your data and is sold as a subscription service for $995 per year. It offers much of the same data analysis and reporting capabilities as the WinCross or GMI tool, and it has the added advantage of being accessible from anywhere you have a web connection. There are some limits to the size of the files you can upload yourself (only 50,000 records) but if you’re looking for a fairly easy to use program that doesn’t require a lot of effort to learn, visit the MarketSight web site.

SPSS Desktop Reporter was recently released as an integrated member of the SPSS Dimensions suite of products. Based on my limited experience with the program it appears to be a very user-friendly tool that can be used independently of the other Dimensions products. If you’ve ever used mrTables — well, this is like mrTables, except that you have all of the data available on your desktop and there are many more options. SPSS Desktop Reporter has all of the features of the other software packages, plus especially easy integration with other SPSS products. It sells for about $2,800 per copy. More information is available at SPSS.com.

If you are on a "budget" (and I use that term loosely) and already have SPSS, you may be able to get SPSS Tables (a module available for SPSS) to do many of the things you need, albeit with the requirement that you do a lot of it manually. It sells for only $800 a copy and can be found on the SPSS web site.

Another package worth considering if you are on a budget is Memphis Market Intelligence’s Survey Explorer. Different versions of the software are available which offer the ability to work with differently sized datasets, at different prices. So if you are dealing with a medium sized data set (50 questions, with no more than 2500 respondents) you could get the "Personal" edition of the software for only $519, the "Professional" edition for $579, or the "Enterprise" edition for $749. The most advanced version of the software allows for an unlimited number of questions, an unlimited number of waves and up to 20,000 respondents and can be purchased for $1,439. Although I haven’t had a chance to play with it, it appears to offer many of the features of the more expensive packages and is worth looking into at the Memphis Software web site.

If there are any other packages that you would recommend, please don’t hesitate to send them to me.

Overview of SPSS Dimensions

Sunday, October 22nd, 2006

SPSS recently announced the release of SPSS Dimensions 4.0, the latest incarnation of its enterprise survey and analysis suite that does everything from helping you create surveys to analyzing the data to generating reports. Before looking into the new features introduced in version 4.0, I thought it might first be interesting to explore the basic features of the system. In other words, what is SPSS Dimensions?

SPSS Dimensions isn’t so much an individual software package as it is a platform of several independent software packages that are able to work together in a relatively seamless fashion. It is sort of like how each of the programs within Microsoft Office (Word, Excel, PowerPoint, Access, etc) can work independently (and be purchased independently) but also work very well together. Like Office, most of the packages that work with Dimensions are published by SPSS, although the platform has been designed to accommodate the integration of software written by 3rd party developers (and several such packages do exist).

Some of the programs that work with Dimensions include:

  • SPSS mrDialer - Automated dialing for phone surveys
  • SPSS mrInterview - Create and execute online surveys
  • SPSS mrInterview CATI - Create and execute phone surveys
  • SPSS mrPaper - Create and execute paper surveys
  • SPSS mrScan - Scan paper surveys
  • SPSS mrStudio - Manage and manipulate data
  • SPSS Desktop Reporter - Create tables from local data
  • SPSS mrTables - Interact with tables on your desktop
  • SPSS mrTranslate - Manage translations of surveys and reports
  • Techneos Entryware - Collect data using handheld devices
  • SPSS Base - Analyze data
  • Clementine - Data mining
  • SPSS Text Analysis for Surveys - text analysis & categorization

At the core of SPSS Dimensions is the Dimensions Data Model, a set of components (openly documented and supported) which allow for accessing information about questionnaires and respondent data. It also deals with keeping track of changes to the questionnaire (versioning), translating both questions and data from one format to another, and managing data stored in multiple formats and platforms.


Visual representation of the Dimensions Data Model
(from the SPSS presentation "Using the SPSS MR Data Model")

The diagram above describes the role of the data model well. Data can be collected from multiple sources. Instead of each collection program storing the data in its own database, it sends the data to the Dimensions Data Model, which puts it into its own special format. When another program, such as a data processing program or a data analysis program, needs the data, it requests it from the Dimensions Data Model using standardized request formats (that just about any program can use).
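
The general pattern described here (many collection tools in, one common format, one standard way to ask for the data) can be sketched in a few lines of Python. This is only an illustration of the pattern and has nothing to do with the actual Dimensions Data Model components or their interfaces; the record shapes, adapter functions and the `CommonStore` class are all hypothetical.

```python
# Each hypothetical collection tool hands over records in its own shape.
web_records = [{"resp_id": "w1", "q1": "Yes"}]
phone_records = [("p1", "No")]   # (respondent id, answer to q1)
paper_records = ["s1;Yes"]       # scanned form as a delimited string


def from_web(record):
    """Adapter: web survey record to the common case format."""
    return {"respondent": record["resp_id"], "mode": "web", "q1": record["q1"]}


def from_phone(record):
    """Adapter: phone survey record to the common case format."""
    respondent, answer = record
    return {"respondent": respondent, "mode": "phone", "q1": answer}


def from_paper(record):
    """Adapter: scanned paper record to the common case format."""
    respondent, answer = record.split(";")
    return {"respondent": respondent, "mode": "paper", "q1": answer}


class CommonStore:
    """Toy stand-in for a shared data model: one format in, one query out."""

    def __init__(self):
        self._cases = []

    def load(self, records, adapter):
        # Convert each incoming record to the common format before storing it.
        self._cases.extend(adapter(record) for record in records)

    def query(self, **criteria):
        # Any analysis program asks for cases the same way, whatever the source.
        return [case for case in self._cases
                if all(case.get(key) == value for key, value in criteria.items())]


store = CommonStore()
store.load(web_records, from_web)
store.load(phone_records, from_phone)
store.load(paper_records, from_paper)

yes_cases = store.query(q1="Yes")
print([case["respondent"] for case in yes_cases])
```

The point of the design is that adding a new collection mode means writing one new adapter, while every analysis tool keeps using the same query.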

Consider a project in which you need to collect data using three different survey techniques: a phone survey, a web-based survey and a paper survey. Even though you’re going to ask the same basic questions in each survey, you are still going to have to develop three completely different questionnaires in order to complement each of the media, which further means you’re going to have to program the survey three different times (perhaps four, if you consider that you may be using scanning software to read some of your paper surveys).


Even though the question is the same, it needs to appear different
across modes and across functions (from the SPSS presentation
"Improving Government Programs with Comprehensive Data Collection")

After you’ve finished collecting the data (using three separate data collection tools, all of which store the data in their own, separate proprietary format that exports into the frustratingly simple CSV format), you’ll then have to combine all of the data into one file which you’ll then need to clean and prepare for analysis. Following analysis, you’ll export your results into yet another program.

Using the SPSS Dimensions Suite (or more specifically, software that is integrated into the Dimensions Suite) makes the process go much faster by optimizing the mechanics of designing and fielding your questionnaire and analyzing and reporting on the data.


Dimensions reduces the time it takes to conduct a complex research project
(Source: SPSS presentation "Discover it with Dimensions")

SPSS Dimensions has been developed based on the notion of "Design Once, use Many," so once you have created your initial questionnaire (either using a simple, graphical user interface found in mrInterview or the more advanced script-driven interface provided by mrStudio; either package will allow you to import the text of your survey from MS Word), you can then quickly (and easily) set it up to deploy using multiple modes (paper, web, CATI, etc).

Perhaps one of the most exciting features of SPSS Dimensions is its multi mode deployment capabilities. Most surveys today require some amount of programming to deal with skipping, piping, the incorporation of outside data, and other advanced options. Ordinarily, each mode would require its own programming. SPSS Dimensions is designed so that you only need to write the script once and it will work the same in each context.


SPSS Dimensions allows you to program your survey once
and have it work on multiple platforms. (from the SPSS presentation
"Improving Government Programs with Comprehensive Data Collection")

Dimensions not only helps you design and execute your survey, it also manages security, translation of the survey into multiple languages, and multiple versions of your survey as well.

External databases containing participant details can be added at any time, and they can be used both in the survey and during analysis. Data from outside sources can be reviewed during the survey to check for inaccuracies, and it can even be updated based on responses given in the survey.

All of the data that is collected, regardless of how it is collected, goes into one SPSS Dimensions database where it can then be analyzed and reported on. Although Dimensions is an open platform that will allow analysis to be conducted in any program (it will export data into a variety of formats for other programs to use), the suite is optimized to work with several SPSS-published programs, such as mrStudio, mrTables, SPSS for Windows and Clementine. Results can then be automatically turned into interactive web-based reports or analyzed using Excel, Word or PowerPoint. Dimensions integrates all of the major capabilities provided by SPSS’s various data analysis packages, including SPSS Base for statistical analysis; Clementine for data mining, and SPSS Text Analysis for Surveys for text analysis and categorization as well as a variety of SPSS and 3rd party data collection and reporting tools.

Reasons to Consider SPSS Dimensions

  • Powerful interviewing engine
  • Open architecture
  • Web-based user interface
  • Easy to create surveys
  • Write the survey once, use in multiple modes (phone, web, etc)
  • Write the scripting/programming once
  • Easy to program (similar to VB Script)
  • Common data storage format/interface
  • Translation capabilities
  • Faster development and analysis time
  • Works with (some) third party applications
  • Scriptable (write your own scripts to work with data)
  • Integrates well with SPSS
  • Integrates well with Excel

Reasons not to use SPSS Dimensions

  • Expensive
  • Limited to Dimensions compatible tools
  • Complicated to set up and integrate with existing systems
  • Requires lots of IT support

Learn more at the SPSS web site.