Analysis Variables Archives | SnapSurveys
Support documentation for Snap Surveys products
Thu, 11 Jul 2024 08:56:53 +0000

RIM weighting
https://www.snapsurveys.com/support-snapxmp/snapxmp/rim-weighting/
Thu, 30 Jun 2022 16:04:38 +0000

RIM weighting is used when you wish to provide weighting for more than one variable to achieve an even distribution of results across an entire dataset.

It can also be used to produce an analysis in which the proportion of respondents in your sample is adjusted to match more closely to the proportion in the target population.

For example, if you wished to weight your samples so that they were 50% male and 50% female, and also 20% in each of five age brackets, the algorithm would calculate the correct weighting that needed to be applied to each table entry (combining age and gender).
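RIM weighting is commonly implemented as iterative proportional fitting (raking). The sketch below illustrates that general technique for the gender and age example; it is not Snap's actual implementation. The function name `rake` and its parameters are hypothetical; `max_iterations` and `threshold` are assumed to play roughly the role of the maximum iterations and match threshold settings.

```python
from collections import defaultdict

def rake(cases, targets, max_iterations=50, threshold=1e-6):
    """Return one weight per case so weighted shares approach `targets`.

    cases   -- list of dicts, e.g. {"gender": "M", "age": "young"}
    targets -- {variable: {code: target proportion}}, each summing to 1
    Assumes every target code has at least one respondent and no data is missing.
    """
    weights = [1.0] * len(cases)
    for _ in range(max_iterations):
        worst = 0.0
        for var, shares in targets.items():
            total = sum(weights)
            current = defaultdict(float)
            for case, w in zip(cases, weights):
                current[case[var]] += w
            # Scale each code so this variable's weighted totals hit the target.
            factors = {code: share * total / current[code]
                       for code, share in shares.items()}
            weights = [w * factors[case[var]] for case, w in zip(cases, weights)]
            worst = max(worst, max(abs(f - 1.0) for f in factors.values()))
        # Stop once a full pass over all variables barely changes anything.
        if worst < threshold:
            break
    return weights
```

Each pass fixes one variable at a time, which slightly disturbs the others; repeating the passes lets all variables settle on their targets together.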

RIM weighting works best for single response variables where there is no missing data, and the counts or percentages are similar to the existing data responses.

It is not a good idea to use RIM weighting if:

  • Your variables are related to each other (for example, income bracket and dwelling size).
  • The values vary enormously (for example, you have 96 males and four females, and you are attempting to balance it to 50:50).
  • You are applying a very large number of weights.

Creating a RIM Weight

  1. In the Survey Overview window, open the survey.
  2. Click Analysis Variables on the Snap XMP Desktop toolbar. This opens the Analysis Variables window which displays a list of the analysis variables.
  3. Click New Analysis Variables Item on the Analysis Variables toolbar. This displays a menu of analysis variables to choose from.
  4. Click New RIM Weight. This opens the RIM Weight window. Note that the status bar initially shows an error because no variables are referenced yet.
  5. In Name, enter a name which describes the RIM weight.
  6. In Label, enter a description of the RIM weight.
  7. In Target total, select the required option.
  8. In Missing data, there are two choices: exclude partial cases or include partial cases. Excluding the partial cases provides the most accurate result as the partial cases have missing data in their responses. The default is exclude partial cases.
  9. In Filter, enter a filter expression to create the RIM weight for a subset of the response data.
  10. Click Add Variable to open the Select variable dialog.
  11. Select a variable from the Name drop-down as the weighting variable.
  12. Click OK. This displays default values for the RIM weight in the grid. The grid shows the ratio, expected count and percentage as well as the actual count and percentage.
  13. Click in the Ratio column to change the ratio or target number depending on the Target total option selected.
  14. Repeat for all the variables you wish to add.
  15. If you wish to remove a variable, select it in the grid then click Delete Variable.
  16. Click Save to save the RIM weight.

Target Totals

There are three options for setting the Target total: Valid cases, From targets and Custom.

Valid cases

Valid cases bases the RIM weight on the valid data responses in the survey. The default ratio is 1 to give an equal distribution for all variable codes. The Ratio column can be changed to a proportion or target number for each variable code. For a target number to be used, the numbers in the Ratio column (per variable) must add up to the number of valid cases; otherwise the values are treated as proportional ratios. The Expected column shows the target counts. The total number of valid cases is shown next to the drop-down.

RIM weight with Valid cases selected and an equal ratio distribution

From targets

From targets sets a ratio or target number for each code within a variable. This is set in the Ratio column. The overall target number is shown next to the drop-down and is the total of the target numbers entered for the variable's codes in the Ratio column. The Expected column shows the target counts for each variable code.

RIM weight with From targets selected and target numbers set for each code

Custom

Custom allows an overall target number to be entered. Selecting this option enables the field next to the drop-down where you can enter the target number. The default ratio is 1 which shows an equal distribution of the target number split across all the codes within each variable. The target number can be edited in the Ratio column. The Expected column shows the target counts for each variable code.

RIM weight with Custom selected and target numbers set for each code.

Note: The RIM weighting in Snap XMP Desktop works out whether the number entered in Ratio is a percentage, proportion or target number and does not require a percentage sign (%) to be entered.
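As a rough illustration of how the Ratio column relates to the Expected column, the sketch below scales a set of ratios proportionally to a target total. The rule Snap uses to tell percentages, proportions and target numbers apart is not documented here; proportional scaling is an assumption that happens to treat all three the same way. The function name `expected_counts` is hypothetical.

```python
def expected_counts(ratios, target_total):
    """Scale per-code ratios so that they add up to target_total."""
    total_ratio = sum(ratios.values())
    return {code: target_total * ratio / total_ratio
            for code, ratio in ratios.items()}

# A ratio of 1 for every code gives an equal split of the target total.
equal_split = expected_counts({"Male": 1, "Female": 1}, 200)

# Percentages and target numbers that already sum to the total give the same counts.
from_percentages = expected_counts({"Male": 60, "Female": 40}, 200)
from_targets = expected_counts({"Male": 120, "Female": 80}, 200)
```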

Tailor the RIM Weight

You can customise the decimal places, maximum iterations and match threshold for the RIM weighting.

  1. Click Tailor on the RIM Weight toolbar where you can edit the number of decimal places, maximum iterations and match threshold for the RIM weighting calculation.
  2. Click OK to save.

Assess the RIM Weight

Assessing the RIM Weight gives statistics of how efficiently the RIM weight will meet the target.

  1. Click Assess RIM Weight on the RIM Weight toolbar to assess the RIM weight.
  2. A summary shows the build information. Click OK to close the summary.

Errors

If there is an error in the RIM weight the status bar will show an error message. Clicking Assess RIM Weight also shows an error message. For example, when not enough target values are set, the assessment shows an error message and the status also displays this message.

Build the RIM Weight

The status of the RIM Weight displays in the status bar at the bottom of the RIM Weight window. When the RIM Weight is created the status shows as Not built.

  1. Click Build RIM Weight on the RIM Weight toolbar to build the RIM weight.
  2. The Status changes to show Built. If there is an error in the RIM weight the status bar will show an error message and the RIM weight is not built.
  3. Click Save to save the RIM weight.

Using the RIM Weight

The RIM weight can be included when creating analyses.

  1. Click the required analysis icon on the Snap XMP Desktop toolbar. This opens the Analysis Definition window.
  2. In Analysis and Break, enter appropriate break and analysis fields.
  3. In Weight, enter the name of the RIM weight.
  4. Click OK to build the RIM-weighted analysis.

Introduction to the analysis variables window
https://www.snapsurveys.com/support-snapxmp/snapxmp/introduction-to-analysis-variables-window/
Wed, 30 Jun 2021 09:06:20 +0000

The Analysis Variables window displays a summary of all the analysis variables including auto category variables, factor analyses, cluster analyses and groups that have been created in the survey. Analysis variables are derived from other variables, according to criteria you specify.

AnalysisVariables.PNG
  • Auto category variables
    • These categorise open response variables into codes. You can then analyse the resulting codes and display them as clouds, frequency tables etc.
  • Group variables
    • Group variables allow you to treat a group of similar variables as a single variable: for example grid questions. The variables must share a code list.
  • Factor analyses
    • Factor analyses summarise the responses to several variables as composite variables. They allow you to look at underlying patterns in data.
  • Cluster analyses
    • Cluster analyses analyse factor analyses, and create variables based on how the responses are clustered together.

Button                 Menu Option                    Description
NewSurveyIcon.png      Edit | New                     Create a new auto category variable, group variable, factor analysis or cluster analysis.
CloneSurveyIcon.png    Edit | Clone                   Clone variable or analysis and show details.
DeleteSurveyIcon.png   Edit | Delete                  Confirm then delete selected item.
VariablePropsIcon.png  Edit | Modify                  Show details of selected item.
SourceDependIcon.PNG   View | Sources and Dependents  Show the Sources/Dependencies dialog.
TailoringIcon.PNG      Tailor | Variables             Set the default options for the variables.
PrintIcon.PNG          File | Print Report            Print the analysis variables report.
CopyIcon.png           Edit | Copy                    Copy selection information to the Clipboard.
PasteIcon.png          Edit | Paste                   Create a group or analysis from information on the Clipboard.

Introduction to cluster analysis
https://www.snapsurveys.com/support-snapxmp/snapxmp/introduction-cluster-analysis/
Wed, 30 Jun 2021 09:02:26 +0000

Cluster analysis is used as a method of segmenting the market on a combination of variables rather than the usual straightforward segmentation variables such as age, gender, location, etc. It is most effective when used with quantity variables or, failing that, single-response variables with relatively many possible codes.

All of the source data should be at least ordinal by nature, and ideally of interval or ratio type. That is, if a single response variable is used, it should be one where the codes show an ordered increase or decrease in response. Thus, an ordered age variable would be acceptable but a gender or geographical region variable would not.

Cluster analysis is an exploratory technique designed to identify patterns in data that may not be immediately obvious. Its object is to sort cases into groups, known as clusters, so that the members of a particular cluster are similar to each other but members of different clusters are dissimilar.

It is not a classification technique as it makes no assumptions about the nature of the groups or clusters prior to the analysis being carried out. The groups are constructed based on the data cases provided with each case being assigned to the cluster that it is most like, and each cluster being defined by the characteristics of its members.

The algorithm for the Cluster Analysis used in Snap XMP Desktop, known as k-means clustering, is as follows.

  1. The user specifies how many distinct clusters are required and which variables are to be used in the analysis.
  2. Each cluster is then assigned a value for each variable. Typically, this will be done arbitrarily taking into account the range of values for each variable. As an example, if only two variables are specified, they could be plotted on a two-dimensional scatter graph with the cluster centres represented by points on the graph.
  3. Having assigned the initial cluster centres, Snap then considers each case in turn and calculates which cluster centre it is closest to. The case is then assigned to that particular cluster.
  4. Once all cases have been considered, and allocated membership of one of the clusters, the cluster centres are recalculated as the mean value of all the members of that cluster.
  5. A consequence of recalculating the cluster centres is that some of the cases may now be in the wrong cluster. That is, the centre of the cluster of which a case is a member may have moved further away from it, while the centre of a nearby cluster may have moved closer.
  6. Snap repeats steps 3 and 4, assigning each case to the cluster whose centre is closest and then recalculating the centres, until convergence is reached.
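The steps above can be sketched in a few lines of Python. This is a minimal illustration of standard k-means, not Snap's own code; the function name `kmeans` and the convergence test (stop when no case changes cluster) are assumptions.

```python
import math

def kmeans(points, centres, max_iterations=100):
    """Assign each point to its nearest centre, then move centres to cluster means."""
    centres = [list(c) for c in centres]
    assignment = [None] * len(points)
    for _ in range(max_iterations):
        # Step 3: allocate every case to the closest cluster centre.
        new_assignment = [
            min(range(len(centres)), key=lambda k: math.dist(p, centres[k]))
            for p in points
        ]
        # Converged when no case has changed cluster.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Step 4: recalculate each centre as the mean of its members.
        for k in range(len(centres)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:  # an empty cluster keeps its previous centre
                centres[k] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centres
```

For example, `kmeans([(0, 0), (0, 1), (10, 10), (10, 11)], [(0, 0), (10, 10)])` puts the first two cases in one cluster and the last two in the other.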

Initially there is likely to be considerable movement between clusters; however, convergence is quickly achieved in most cases. Typically, successive iterations see fewer cases move from one cluster to another, so the cluster centres change less and there is less movement in the next iteration.

Standardised values

Snap uses Standardised Data Values to perform the Cluster Analysis and allows the user to see the standardised values in the results if desired. The Standardised Data Values are calculated by applying a transformation to the initial data set, creating a set of values with a mean of 0 and a standard deviation of 1.

This is an essential process as the source variables may have very different orders of magnitude. For example, consider two quantity variables, Age and Salary, included in the source data. It is probable that the values for Salary will be of a different order of magnitude (tens of thousands) from that of Age (tens). If the Cluster Analysis used the actual data values, differences in salary would be given considerably more importance than differences in age. The resulting clusters would be determined predominantly by differences in salary. Standardising the data sets creates a “level playing field” so that all source variables are compared on equal terms.

When the results are reported, it is often unhelpful to use the standardised values as these have no units and therefore have limited use for interpretation. Snap XMP Desktop by default reports the actual data values.

Snap measures the distance between cases and cluster centres using the Euclidean method, that is, the straight-line distance between the two points on a graph.

Standardised Data Values are calculated by subtracting the mean value of the entire data set and dividing the result by the standard deviation.
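The Age and Salary example can be sketched as follows. The use of the population standard deviation is an assumption (the text does not say whether Snap uses the population or sample form), and the data values are made up for illustration.

```python
import math
import statistics

def standardise(values):
    """Subtract the mean and divide by the standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population SD: an assumption
    return [(v - mean) / sd for v in values]

# Illustrative data only.
ages = [25, 35, 45, 55]
salaries = [20000, 30000, 60000, 50000]

raw_cases = list(zip(ages, salaries))
std_cases = list(zip(standardise(ages), standardise(salaries)))

# Euclidean (straight-line) distance between the first two cases.
raw_distance = math.dist(raw_cases[0], raw_cases[1])  # dominated by the salary gap
std_distance = math.dist(std_cases[0], std_cases[1])  # age and salary on equal terms
```

On the raw values the ten-year age gap is negligible next to the 10,000 salary gap; after standardising, both variables contribute on a comparable scale.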

Running means

The specification for Cluster Analysis includes an option to use Running Means. By default, this is not selected. If the Running Means option is switched on, then the calculation of the cluster centres takes place every time a data case is allocated to a new cluster, rather than waiting until all cases have been evaluated.
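One way to picture Running Means is as an incremental update of the two affected cluster centres whenever a single case moves. The sketch below shows that arithmetic; it is an assumption about how such an update can be done, and the function name `move_case` is hypothetical.

```python
def move_case(case, old, new, centres, counts):
    """Move `case` from cluster `old` (or None) to cluster `new`.

    centres -- list of cluster means, updated in place
    counts  -- number of cases currently in each cluster, updated in place
    """
    if old is not None:
        counts[old] -= 1
        if counts[old] > 0:
            # Remove the case's contribution from its old cluster mean.
            centres[old] = [c - (x - c) / counts[old]
                            for c, x in zip(centres[old], case)]
    counts[new] += 1
    # Add the case's contribution to its new cluster mean.
    centres[new] = [c + (x - c) / counts[new]
                    for c, x in zip(centres[new], case)]

# Cluster 0 holds (0, 0) and (2, 2); cluster 1 holds (10, 10).
centres = [[1.0, 1.0], [10.0, 10.0]]
counts = [2, 1]
move_case((2, 2), old=0, new=1, centres=centres, counts=counts)
```

Because the centres shift after every move, later cases in the same pass are compared against the updated centres rather than the ones from the start of the pass.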

Creating a cluster analysis

  1. Open the survey.
  2. Click the AnalysisVariablesIcon.png button on the toolbar to open the Analysis variables overview window. The overview window shows all the Group and Auto-category variables and Factor and Cluster analyses currently set up in the current survey.
  3. Click the NewSurveyIcon.png button and select New Cluster Analysis… from the menu. The Cluster Analysis Details window appears.
ClusterAnalysis1.PNG
  4. Specify a name and descriptive label as required.
  5. Specify the list of variables from which clusters are to be extrapolated in the Source field. For example, specify Q5, Q2 to have clusters evaluated for those two variables. Use range definitions if the variables fall into a consecutive range, for example Q6a to Q6e would include all variables between and including Q6a and Q6e.

Once the source variables have been specified, the clusters will be determined and results may be reviewed or further qualified by clicking on the appropriate tab.

Initial centres

If clusters are not clearly defined, the initial centres selected may have an effect on the clusters produced. As an extreme example, if you have data that naturally looks like two clusters and you ask for three clusters, the third cluster does not naturally have a suitable centre, so its final position will be affected by its starting position.

Look at the centres and the scatter plot to see if you think the clusters are well spaced for the data – the F-values help with this too.

The Initial centre options are:

Zero (default) – start at zero, so the clusters move away from zero sequentially.

First cases – the n cluster centres are set to the first n cases. This provides some real positions as initial centres. It is rather open to influence from what people have answered.

Evenly spread – for each source variable, find the minimum and maximum values. The first cluster centre starts at the minimum values and the last cluster starts at the maximum values. The other clusters are evenly spaced in a ‘line’ between these extremes. This will favour results ranging from generally good to generally poor.
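The three options can be sketched as below. This is an illustration of the descriptions above rather than Snap's code; the function name `initial_centres` and the method labels `"zero"`, `"first"` and `"even"` are hypothetical.

```python
def initial_centres(cases, n_clusters, method="zero"):
    """Return starting centres for n_clusters, one list of values per cluster."""
    n_vars = len(cases[0])
    if method == "zero":
        # Every centre starts at zero on every variable.
        return [[0.0] * n_vars for _ in range(n_clusters)]
    if method == "first":
        # The first n cases become the n starting centres.
        return [list(case) for case in cases[:n_clusters]]
    if method == "even":
        # Spread centres on a line from the per-variable minimum to the maximum.
        lows = [min(col) for col in zip(*cases)]
        highs = [max(col) for col in zip(*cases)]
        steps = max(n_clusters - 1, 1)
        return [[lo + (hi - lo) * k / steps for lo, hi in zip(lows, highs)]
                for k in range(n_clusters)]
    raise ValueError(f"unknown method: {method}")
```

For the cases `(1, 10)`, `(3, 30)` and `(5, 20)` with three clusters, the evenly spread option starts the centres at `(1, 10)`, `(3, 20)` and `(5, 30)`.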

Examining a cluster analysis

The Cluster Analysis dialog allows you to examine the defined clusters in different views by selecting different tabs.

  1. Create a new Cluster Analysis based on the Crocodile Rock Cafe survey with
    • Name: CL1
    • Label: Cluster Analysis CL1
    • Clusters: 2
    • Source: Q5, Q2, Q8
ClusterAnalysis2.PNG
  2. Click on the Results tab to show details of the cluster centres for each variable and the count of respondents in each cluster.
ClusterAnalysis3.PNG
  3. By default, actual values will be shown (as indicated by the Show setting Actual Values). Under this setting, values for quantity variables will reflect the actual answers; values for categorical (single response) variables will reflect the code values.
  4. Change the Show setting to Code Labels to show code labels for categorical variables. Results for quantity variables will still reflect the actual values of those variables.
ClusterAnalysis4.PNG
  5. Change the Show setting to Standardised to show standardised results for all variables.
ClusterAnalysis5.PNG
  6. Use whichever of the Show settings is appropriate for determining a description for each of the clusters. A descriptive label can be allocated to each cluster in either the Results tab or the Setup (previous) tab.
  7. Click the Centre Distances tab to see the table of cluster centre distances. The distances are shown between clusters either as Actual Values (with the Show setting as either Actual values or Code Labels) or Standardised Values.
ClusterAnalysis6.PNG
  8. To see how cluster centres move during the iterative calculation process, click on the Iteration drop-down. The default setting, Final, shows the result at the end of the last iteration.
  9. For an alternative view of the movement of cluster centres, click the Iteration History tab to show the change in each centre during the iterative calculation process.
ClusterAnalysis7.PNG
  10. Click on the Anova tab to show the results of the Analysis of Variance for the current cluster solution. The Mean Square values show the average (mean) squared distance between each of the cluster centres (Between Clusters) and between each case and the centre of the cluster to which it belongs (Within Clusters). The F-value is a statistical measure of how distinct the cluster groups are; a high F-value will indicate highly distinct sub-groups.
ClusterAnalysis8.PNG
  11. The F-Values tab shows a summary of the F-values for several different cluster solutions, with the current solution highlighted. Generally speaking, high F-values indicate that the members of each cluster group are homogeneous, and that the cluster groups are highly distinct from one another.
ClusterAnalysis9.PNG
  12. The Scatter Plot tab shows a plot of data case locations and cluster centres. Each cluster centre is automatically allocated a unique colour. The points representing each case are coloured to indicate the corresponding cluster they have been allocated to. The plot is of one variable against another. If there are more than two variables in the source, drop-down boxes enable you to specify which two should be plotted.
ClusterAnalysis10.PNG

Although F-Values can be used as an indicator of how many cluster groups should be specified, it is not advisable to rely exclusively on this measure. The solution with the highest F-Value will not necessarily be the ideal solution, and users should rely on their own knowledge of the customer base.
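For reference, the F-value for a single variable can be computed with the standard one-way analysis of variance formula: the between-clusters mean square divided by the within-clusters mean square. The sketch below assumes this textbook formula; whether Snap's Anova tab computes it in exactly this way is not stated here, and the function name `f_value` is hypothetical.

```python
def f_value(values, clusters):
    """One-way ANOVA F for one variable, grouped by cluster membership."""
    groups = {}
    for value, cluster in zip(values, clusters):
        groups.setdefault(cluster, []).append(value)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    # Between clusters: spread of the cluster means around the overall mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups.values())
    # Within clusters: spread of each case around its own cluster mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2
                    for g in groups.values() for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

With values `[1, 2, 3, 7, 8, 9]` split into two clusters of three, the cluster means (2 and 8) are far apart relative to the spread inside each cluster, so the F-value is large.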

Introduction to factor analysis
https://www.snapsurveys.com/support-snapxmp/snapxmp/introduction-factor-analysis/
Wed, 30 Jun 2021 09:01:27 +0000

Factor Analysis is a data reduction technique that looks at responses to several variables and summarises them into composite variables, known as factors, that make analysing the data a more manageable task. Also called Principal Components Analysis, its main use is in identifying the underlying patterns in the way customers have responded to a series of questions.

Typically, a survey might contain a series of questions asking respondents to express an opinion on different aspects of the product or service being evaluated. There may be dozens of questions that all require a response using, for example, a 7-point rating scale. Spotting trends in such a long list of questions can be difficult, so Factor Analysis is used to reduce the list to one of a more manageable length.

In Snap XMP Desktop, the Factor Analysis technique looks at correlations between each pair of questions and combines variables that have a high correlation with each other. These groups of variables are then combined in a particular way to form the factors. As the resultant factors take into account responses to several different source variables, the original list of variables can be reduced to a more manageable number, with each factor as a form of derived variable.

Typically each factor produced by the analysis will be heavily based on a subset of variables that are in some way similar. Since the resulting factors are stored within Snap and can be used as variables in subsequent analysis, knowing the source variables that have influenced the factor most heavily helps when giving the factors meaningful names.

Since the purpose of Factor Analysis is to reduce the number of variables to a more manageable level, it is likely that only a small number of the factors will be retained. The factors are listed in decreasing order of importance, allowing you to choose how many to retain for further analysis. The first factor will be the one that explains the highest amount of the total variance within the data (for the variables used in the source). The second factor will be the one that explains the highest amount of the remaining variance, and so on. The number of factors to be retained will depend on how many source variables are used, and the data for those source variables. This decision is, to some degree, arbitrary. There are theories that provide guidance on how many factors to take, such as ignoring factors with an Eigenvalue below a certain threshold, or taking sufficient factors to have a cumulative variability proportion of greater than a prescribed level.

An important application of Factor Analysis is as a precursor to Cluster Analysis.

Creating a factor analysis

  1. Open the survey.
  2. Click the AnalysisVariablesIcon.png button on the toolbar to open the Analysis variables window. The overview window shows all the Group variables, Auto-category variables, Factor analyses and Cluster analyses set up in the current survey.
AnalysisVariables.PNG
  3. Click the NewSurveyIcon.png button and select New Factor Analysis from the menu. This opens the Factor Analysis Details window.
FactorAnalysis1.PNG
  4. Provide a suitable name and descriptive label for the factor analysis.
  5. In the Source field, specify the list of variables for which factors are to be derived. Use range definitions if the variables fall into a consecutive range; for example, Q6a to Q6e includes all variables between and including Q6a and Q6e. Separate variables with commas if they do not fall into a consecutive range, e.g. Q6a, Q6b, Q6d.
  6. Once the source variables have been specified, click in the table below and the factors will be calculated. Snap will derive the same number of factors as there are input (source) variables.
FactorAnalysis2.PNG

By default, Snap XMP Desktop uses the Jacobi algorithm for calculating factors. Selecting the Varimax option is an extra step which can make it easier to interpret the factors produced.

Understanding the factor analysis table

FactorAnalysis2.PNG

The results table shows for each factor:

Label

An editable text description of the factor. Choose a useful name by inspecting the factor loadings (see below).

Eigenvalue

Measure of the weight (or importance) of the factor in representing the variables given as sources. The sum of all eigenvalues is the same as the number of input variables, so any factor with an eigenvalue greater than 1.0 can be thought of as being better than average.

Factors are always shown arranged in decreasing order of eigenvalues.

Proportion

How much of the variability of the data is explained by each factor. It is calculated as the eigenvalue divided by the number of factors/variables and is equivalent to the percentage variability in the data represented by that factor.

If one factor has a very high proportion, i.e. more than 80%, and the rest are all very low, it is possible that the questions have not covered all aspects of customers’ attitudes.

Cumulative Proportion

A running total of the previous column (Proportion). Since factors are arranged in order of decreasing eigenvalues, the cumulative proportion represents the percentage variability in the data represented by the specified and all preceding factors.

Factor Loadings

How much a particular variable contributes to the factor. Large loadings show that the variable is relatively important; smaller values indicate that it has less influence. These loadings will help you provide a suitable name for each factor.
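The columns described above can be reproduced from a correlation matrix with a short sketch. It uses the classical Jacobi eigenvalue method for symmetric matrices and defines loadings as the eigenvector scaled by the square root of the eigenvalue, which is the usual principal components convention. Both are assumptions about the general technique, not a description of Snap's internal code, and the function names are hypothetical.

```python
import math

def jacobi_eigen(matrix, sweeps=50, tol=1e-12):
    """Eigenvalues and eigenvectors (columns of v) of a symmetric matrix."""
    n = len(matrix)
    a = [list(map(float, row)) for row in matrix]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes the (p, q) element.
                theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, theta) / (abs(theta) + math.hypot(theta, 1.0))
                c = 1.0 / math.hypot(t, 1.0)
                s = t * c
                for k in range(n):  # rotate columns p and q of a and v
                    a[k][p], a[k][q] = c * a[k][p] - s * a[k][q], s * a[k][p] + c * a[k][q]
                    v[k][p], v[k][q] = c * v[k][p] - s * v[k][q], s * v[k][p] + c * v[k][q]
                for k in range(n):  # rotate rows p and q of a
                    a[p][k], a[q][k] = c * a[p][k] - s * a[q][k], s * a[p][k] + c * a[q][k]
    return [a[i][i] for i in range(n)], v

def factor_table(corr):
    """One row per factor: eigenvalue, proportion, cumulative proportion, loadings."""
    n = len(corr)
    values, vectors = jacobi_eigen(corr)
    order = sorted(range(n), key=lambda i: values[i], reverse=True)
    rows, cumulative = [], 0.0
    for i in order:
        proportion = values[i] / n  # eigenvalues of a correlation matrix sum to n
        cumulative += proportion
        loadings = [vectors[k][i] * math.sqrt(max(values[i], 0.0)) for k in range(n)]
        rows.append({"eigenvalue": values[i], "proportion": proportion,
                     "cumulative": cumulative, "loadings": loadings})
    return rows
```

For two variables with a correlation of 0.5, `factor_table([[1, 0.5], [0.5, 1]])` gives eigenvalues of 1.5 and 0.5, so the first factor accounts for 75% of the variability. The sign of a set of loadings is arbitrary; only their relative sizes matter when naming a factor.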

Arranging factor loadings tables for easy interpretation

Transposing the table

To interpret the factor loadings, it is often easier to transpose the table. Select Transpose to display each factor as a column.

Re-ordering the table

The variable order is set by how the analysis was specified. You can sort the order by importance within each factor. This helps you see which variables have the greatest effect on a factor.

Click in the grey box containing a number at the top of the column or the end of the row containing the factor data. A triangle appears representing the sort order, and the variables re-order by the factor loading for that factor.

Column                  Row                      Sort order
Down-pointing triangle  Right-pointing triangle  Greatest first
Up-pointing triangle    Left-pointing triangle   Smallest first
Blank grey square       Blank grey square        Ordered by analysis specification

Reducing the number of factors

Factor Analysis is a data reduction technique; its purpose is to reduce the initial number of variables to a more manageable number by creating “composite” variables. By default, it produces the same number of factors as there are variables. In a long list of source variables it is likely that several factors will appear to be influenced by the same set of variables. The apparent duplication will usually occur in factors with small eigenvalues. These factors can usually be discarded as being relatively insignificant.

To discard factors, you use the Cutoff settings.

Number of factors

Keep the specified number of factors, selecting those with the highest eigenvalues.

Eigenvalue

Keep the factors with an eigenvalue above the specified value. Some theorists use the rule that factors with an eigenvalue of less than 1.0 should be ignored.

Proportion

Keep the factors with a proportion above the specified value. Some theorists use the rule that factors with a proportion less than a certain amount, e.g. 10%, should be ignored.

Cumulative Proportion

Keep the factors which are required to achieve the specified cumulative proportion. Some theorists use the rule that factors should be included up to a specified cumulative proportion, e.g. 80%.
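The four Cutoff rules can be sketched as a single filter over the eigenvalues. The function name `keep_factors` and its parameters are hypothetical, and whether Snap treats a value exactly equal to the threshold as kept or discarded is an assumption (here "above" is read as strictly greater).

```python
def keep_factors(eigenvalues, number=None, eigenvalue=None,
                 proportion=None, cumulative=None):
    """Return the eigenvalues of the factors retained under the given cutoffs."""
    n = len(eigenvalues)
    kept, running = [], 0.0
    for rank, value in enumerate(sorted(eigenvalues, reverse=True)):
        share = value / n  # proportion of variability for this factor
        if number is not None and rank >= number:
            break
        if eigenvalue is not None and value <= eigenvalue:
            break
        if proportion is not None and share <= proportion:
            break
        # Stop once the factors already kept reach the cumulative target.
        if cumulative is not None and running >= cumulative:
            break
        kept.append(value)
        running += share
    return kept
```

For eigenvalues of 2.0, 1.2, 0.5 and 0.3, a Number of factors cutoff of 2 and an Eigenvalue cutoff of 1.0 both keep the first two factors, while a Proportion cutoff of 10% keeps the first three.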

Weighting in factor analysis

You can apply a weight to factor analysis as you can to other statistics.

For example, if the input variables are rating scale questions with 1 as Very Good and 5 as Very Poor, it is worth specifying a weight to the data so that very good scores have a high value and very poor scores have a low value in the resulting factors.

Applying a weight in the Scale box means:

  • the weight will be applied to all multi-choice and grid questions before the factor analysis is performed
  • the weight will NOT be applied to quantity questions

If all your source variables are rating scale questions, the weight will be applied to them all, and this will have no effect on the Eigenvalues or Factor Loadings that are produced.

If you have a mixture of variable types, the weight will only be applied to multi-choice and grid questions, and factor analysis results will differ from the same analysis without the Scale applied.

In the box labelled Scale, enter the name of a suitable weight, for example, one which weights codes 1 to 5 from -2 to +2.
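The reason a Scale has no effect when every source variable is weighted, but does when quantity variables are mixed in, can be seen from the correlations that factor analysis is built on. The sketch below uses made-up data and an assumed mapping of codes 1 to 5 onto +2 to -2 (so that Very Good scores highest).

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / math.sqrt(sum((x - mx) ** 2 for x in xs)
                           * sum((y - my) ** 2 for y in ys))

# Assumed weight: code 1 (Very Good) -> +2 ... code 5 (Very Poor) -> -2.
SCALE = {1: 2, 2: 1, 3: 0, 4: -1, 5: -2}

# Illustrative data only.
q1 = [1, 2, 2, 4, 5, 3]      # rating scale question
q2 = [2, 1, 3, 5, 4, 3]      # rating scale question
visits = [9, 7, 8, 3, 1, 5]  # quantity question, never weighted

q1_scaled = [SCALE[c] for c in q1]
q2_scaled = [SCALE[c] for c in q2]

both_raw = pearson(q1, q2)
both_scaled = pearson(q1_scaled, q2_scaled)  # same as both_raw
mixed_raw = pearson(q1, visits)
mixed_scaled = pearson(q1_scaled, visits)    # sign is reversed
```

When both variables are rescaled in the same way their correlation is unchanged, so the eigenvalues and loadings are unchanged. When only one of the pair is rescaled, the correlation changes sign, which alters the factor analysis results.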

Introduction to group variables
https://www.snapsurveys.com/support-snapxmp/snapxmp/introduction-to-group-variables/
Wed, 21 Oct 2020 14:39:15 +0000

You can link similar questions into a group. This means that you can then analyse the group of questions as a whole. For example, if you have a series of questions on aspects of personality, you could group all the questions associated with team working together, and use them as a single axis for tables or charts.

Group variables may be used to create summary analyses of a number of variables which share a common code list such as might be found in a question grid.

For example, you could create a horizontal stacked bar chart to show the results of respondents’ opinion of five aspects of service offered.

You can use Group variables to include a summary of all five variables in the same chart. The responses to all the grouped variables are added together when calculating percentages and means, so variables with a larger number of responses contribute more to the group result.
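
The effect of adding responses together can be sketched as follows. This is only an illustration with made-up data, not Snap's implementation: the question names and response lists are hypothetical, and the comparison with a "mean of means" is included just to show the difference.

```python
from collections import Counter

# Made-up responses (codes 1 to 5) for two grid questions sharing a code list.
# Q6b has fewer responses because some respondents skipped it.
responses = {
    "Q6a": [5, 5, 4, 4, 4, 3, 2, 5, 4, 4],
    "Q6b": [1, 2, 1, 3],
}

def percentages(values):
    """Percentage of responses falling on each code."""
    counts = Counter(values)
    total = len(values)
    return {code: 100 * n / total for code, n in sorted(counts.items())}

# Group variable: pool every response from every source variable.
pooled = [code for values in responses.values() for code in values]
group_mean = sum(pooled) / len(pooled)
group_percentages = percentages(pooled)

# For comparison: the unweighted average of the per-question means.
question_means = [sum(v) / len(v) for v in responses.values()]
mean_of_means = sum(question_means) / len(question_means)

print(group_mean, mean_of_means)
```

Because Q6a supplies 10 of the 14 pooled responses, the pooled mean sits closer to the Q6a mean than the simple average of the two question means does.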

  1. Click Analysis Variables AnalysisVariablesIcon.png on the main toolbar to open the Analysis Variables window.
  2. Click New Analysis Variables Item NewSurveyIcon.png and select New Group Variable from the drop-down list. The Group Variable Details dialog opens.
Error shown for a group variable
  3. Give the group variable:
    • Name: GV2
    • Label: Overall opinion
    • Source: list the variables to group, e.g. Q6a~Q6e or Q1, Q5, Q8
Create a group variable
  4. Click Save SaveIcon.png to save the variable. It appears in the list of analysis variables.
Analysis variables list
  5. Click VariablesIcon.png to display the Variables window.
  6. Click NewSurveyIcon.png to add a new variable. This is a blank variable that you can use to insert a blank line in your chart.
  7. Specify the Variable Details:
    • Name: Blank
    • Label: Calculated difference
    • Type: Derived (the variable will derive its data from other existing variables).
    • Response: Single (there will be one response for each case).
Creating a blank derived variable
  8. Click Save SaveIcon.png to save the new variable.
  9. Click Analysis Chart AnalysisChartIcon.png to build a chart.
  10. Specify the Definition details:
    • Style: Horizontal Stacked Bar Percent Transposed
    • Analysis: GV2, Blank, Q6a, Q6b, Q6c, Q6d, Q6e
    • Transpose: Checked
    • Show Options: Check Analysis Percents
Analysis Definition for a group variable
  11. Select the Cells tab and set the decimal places for the Percentages to 2. This reduces rounding errors when calculating the percentages.
Setting the decimal places and accuracy in the analysis
  12. Click OK to display the chart.
Chart for a group variable
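
The reason for setting the percentages to 2 decimal places can be sketched with a small calculation. The counts below are made up; the point is only that percentages rounded to whole numbers can miss a total of 100 by a full point, while two decimal places keep the discrepancy small.

```python
# Made-up counts for the five codes of one question (29 responses in total).
counts = [3, 5, 7, 9, 5]
total = sum(counts)
exact = [100 * n / total for n in counts]

def rounding_error(percentages, places):
    """How far the rounded percentages are from summing to 100."""
    return abs(100 - sum(round(p, places) for p in percentages))

error_0dp = rounding_error(exact, 0)  # percentages shown as whole numbers
error_2dp = rounding_error(exact, 2)  # percentages shown to 2 decimal places

print(error_0dp, error_2dp)
```

With these counts the whole-number percentages total 99, whereas the two-decimal percentages total 99.99.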

Introduction to Auto Category variables https://www.snapsurveys.com/support-snapxmp/snapxmp/introduction-to-auto-category-variables/ Wed, 21 Oct 2020 14:31:11 +0000 https://www.snapsurveys.com/support-snapxmp/?post_type=epkb_post_type_1&p=3011

The post Introduction to Auto Category variables appeared first on SnapSurveys.

]]>
An Auto Category variable generates a list of the most frequently used words or code labels from a question’s response data. Auto Category variables can be used to analyse open-ended questions or multiple response closed questions. They are often used with word clouds.

Auto Category variables can help you to:

  • Limit the number of codes to those with the highest number of responses. The codes may change when the data changes.
  • Create an “Other” category to group the codes that are outside the limit.
  • Remove words that add nothing to the analysis, by adding them to the list of stop words.
  • Order the codes by the number of counts.
  • Set options on how to categorise words in the response.

Auto category variable and word clouds

Creating an Analysis Cloud automatically adds an Auto Category variable.

  1. In the Survey Overview, open the survey required.
  2. Click Analysis Cloud AnalysisCloudIcon.png on the Snap XMP Desktop toolbar. This opens the Analysis Definition window to create a word cloud.
  3. In Analysis, enter the question name. This question usually contains comments given by the respondent.
  4. Click Save SaveIcon.png to save the Analysis Cloud.
  5. Click Analysis Variables AnalysisVariablesIcon.png on the Snap XMP Desktop toolbar to show the list of analysis variables. The Auto Category variable (named AV.Q7 in this example) was created automatically when you saved the Analysis Cloud, and is shown in the list of analysis variables.

Editing the Auto Category variable

You can edit most of the settings in the Auto Category variable.

  1. Click Analysis Variables AnalysisVariablesIcon.png on the Snap XMP Desktop toolbar. This opens the Analysis Variables window. All the Auto Category variables have the Method “AutoCategory Variable” in the list.
  2. Double-click the Auto Category variable that you wish to edit. This opens the Auto Category Variable Details window.
  3. If any analysis depends on the variable, the name is read-only; otherwise you can edit the name.
  4. If any analysis uses the original Auto Category variable, changing the Source question may change the name. In that case a new Auto Category variable is created when you save.
  5. Limit codes sets the maximum number of codes to use. For example, with a limit of 25, a word cloud analysis shows the 25 most commonly used words. Categories that are outside the Limit codes setting appear with the category number in brackets.
  6. The With Other option groups the codes outside the Limit codes setting. For example, if Limit codes is 25, only the top 25 codes are shown and the remaining codes are grouped in the category ‘Other’.
  7. Select Order by counts to order the codes by the number of times they occur in the responses. You can also sort the categories by clicking on the column header.
  8. Click Save SaveIcon.png to save your changes.
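
How Limit codes, With Other and Order by counts interact can be sketched in a few lines. This is a minimal illustration, not Snap's implementation: the function name `auto_categories`, its parameters and the sample words are all hypothetical.

```python
from collections import Counter

def auto_categories(words, limit, with_other=True):
    """Return (label, count) pairs for the `limit` most frequent words.

    When `with_other` is true, every word outside the limit is added
    to a single "Other" category.
    """
    ranked = Counter(words).most_common()  # ordered by count, highest first
    top = ranked[:limit]
    if with_other:
        other_total = sum(n for _, n in ranked[limit:])
        if other_total:
            top.append(("Other", other_total))
    return top

# Made-up words taken from open-ended comments.
words = "fast friendly fast cheap friendly fast slow helpful".split()
result = auto_categories(words, limit=2)
print(result)  # [('fast', 3), ('friendly', 2), ('Other', 3)]
```

Because the ranking is recalculated from the data each time, the words that fall inside the limit can change when new responses arrive.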

Include or exclude categories

  1. Open the Auto Category Variable Details window.
  2. Select Change codes to display the Include column.
Including codes in the Auto Category variable details
  3. Clear Include to exclude a code from the analysis (for example, a word cloud), or select it to include the code. An excluded code moves to the bottom of the list.
  4. Click Save SaveIcon.png to save the changes.

Stopping unwanted words

When a respondent enters a comment, the response often includes common words, such as ‘the’ or ‘and’. The default stop words list includes many of these words, which means they will not appear in the Auto Category variable or the word cloud. You can add your own stop words, and you can also allow any of the default stop words so that they are included in the analysis.

  1. Click the Stop words button to open the Stop Words dialog where you can add stop words or allow a default stop word.
Stop Words dialog
  2. In Stop words, type any words you wish to exclude. Use spaces to separate the words.
  3. In Allow stopped words, type any words from the Default stop words that you wish to include. Use spaces to separate the words. This is only relevant when using the Default stop words list.
  4. Clear the Use checkbox if you do not want to use the Default stop words list. The default is to use the Default stop words list.
  5. Click OK to save your changes.
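
The way the three settings combine can be sketched as set operations. This is an assumption-laden illustration rather than Snap's actual logic: `DEFAULT_STOP_WORDS` is a tiny hypothetical stand-in for the real default list, the function names are invented, and lower-casing every word is a simplification.

```python
# Hypothetical stand-in for the Default stop words list; the real list is longer.
DEFAULT_STOP_WORDS = {"the", "and", "a", "was", "very", "not"}

def effective_stop_words(stop_words="", allow_stopped="", use_default=True):
    """Build the set of excluded words from space-separated settings."""
    excluded = set(stop_words.lower().split())
    if use_default:
        # Allowed words only matter when the default list is in use.
        excluded |= DEFAULT_STOP_WORDS - set(allow_stopped.lower().split())
    return excluded

def filter_words(comment, excluded):
    """Keep the words of a comment that are not stop words."""
    return [w for w in comment.lower().split() if w not in excluded]

excluded = effective_stop_words(stop_words="delivery", allow_stopped="not")
kept = filter_words("The delivery was not very fast", excluded)
print(kept)  # ['not', 'fast']
```

Here "delivery" is removed because it was added as a stop word, while "not" survives because it was allowed even though it is in the assumed default list.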

Options

You can choose different ways of categorising the multiple response or open-ended questions by selecting different options.

  1. Click the Options button. This opens the Auto category Options dialog.
Auto category options
  2. In the Multi Choice section, select Separate Codes to treat each code in the response as separate; otherwise the codes are categorised as a group.
  3. Select Case Sensitive if you want the same word entered in different cases to be placed in different categories; otherwise make sure this option is cleared.
  4. Select Separate Words unless you know that there will be common phrases (e.g., zip codes) that you wish to use as codes.
  5. Select Customise Delimiters if you want to specify the delimiters.
  6. Click OK to save the options.
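
The effect of the Case Sensitive, Separate Words and Customise Delimiters options on an open-ended response can be sketched as below. The function `categorise`, its parameter names and the default delimiter set are assumptions made for illustration; the article does not list the delimiters Snap uses by default.

```python
import re

def categorise(response, case_sensitive=False, separate_words=True, delimiters=None):
    """Split one open-ended response into category labels."""
    text = response if case_sensitive else response.lower()
    if not separate_words:
        # Keep the whole response as one category (useful for phrases).
        return [text.strip()]
    if delimiters is None:
        # Assumed default: whitespace and common punctuation.
        delimiters = " \t\n.,;:!?"
    pattern = "[" + re.escape(delimiters) + "]+"
    return [token for token in re.split(pattern, text) if token]

print(categorise("Great service, great price"))
# ['great', 'service', 'great', 'price']
print(categorise("Great service, great price", case_sensitive=True))
# ['Great', 'service', 'great', 'price']
print(categorise("SW1A 1AA", case_sensitive=True, separate_words=False))
# ['SW1A 1AA']
```

With case sensitivity off, "Great" and "great" fall into the same category; with Separate Words off, a multi-part value stays together as a single code.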

Creating an Auto category variable for a chart

An Auto Category Variable can also be used as the source variable for a standard chart such as a bar chart.

Create the Auto Category Variable

  1. Click Analysis Variables AnalysisVariablesIcon.png to show the list of Analysis Variables.
  2. Click New Analysis Variables Item NewSurveyIcon.png on the Snap XMP Desktop toolbar then select New Auto Category Variable from the list. This opens the Auto Category Variable Details dialog.
  3. In Name, enter a name for the Auto Category variable. In this example, the name is AVQ4a as the source question is Q4a.
  4. In Source, enter the source question. When you tab out of the field, the label updates with the question text.
  5. In Limit codes, enter the maximum number of codes. This determines the number of bars in the chart.
  6. Check Order by counts to order from the highest to lowest number of counts.
  7. Click Save SaveIcon.png to save the Auto Category Variable.

Create the Analysis Chart

  1. Click Analysis Chart AnalysisChartIcon.png on the Snap XMP Desktop toolbar. This opens the Analysis Definition dialog.
  2. In Analysis, enter the Auto Category variable name, for example, “AVQ4a”.
  3. Select Transpose.
  4. Click Apply to update the chart display. This shows a bar chart of the top 5 other items ordered.
