Pentaho Tutorial

Pentaho Tutorial: The Ultimate Guide

Last updated on 09th Aug 2022, Blog, Tutorials

About author

Rajesh (Pentaho Lead / Architect)

He has 6+ years of experience in his technical domain, has been a technical trainer for the past 5 years, and shares these articles with us.

Pentaho Reporting is a suite (a collection of tools) for creating relational and analytical reports. It can be used to transform data into meaningful information. Pentaho can generate reports in HTML, Excel, PDF, Text, CSV, and XML. This tutorial provides a basic understanding of how to generate professional reports using Pentaho Report Designer.

Audience :

This tutorial is designed for all those readers who want to create, read, write, and modify Dynamic Reports using Java. In addition, it will also be quite useful for those readers who would like to become a Data Analyst.

Prerequisites :

Before proceeding with this tutorial, we assume that you have prior exposure to Core Java, Database Concepts, and SQL Queries.

Why do we use Pentaho?

Pentaho helps organizations put large volumes of existing data, including big data, to use. It can handle many kinds of data and offers strong visualization and a wide choice of data sources. Pentaho has community support available around the clock, along with several support forums. It is highly scalable and is designed to serve very large volumes of data. It has a low integration time and a low infrastructure cost compared to other BI tools in the market such as BIA, IBA, SAS BIA, SAP, and many more. It also has an excellent toolset with broad applications.

Features of Pentaho

  • It provides full-time community support
  • It allows the user to add a user-friendly metadata domain to data sources
  • The ad hoc reporting interface offers a step-by-step wizard for designing simple reports. The output formats include HTML, PDF, XLS, and RTF
  • It allows users to execute reports at given intervals
  • Connectivity between the reporting tools and the BI server allows the user to publish content directly to the BI server
  • The Pentaho User Console web interface is used for managing reports and analysis views
  • Report Designer and Design Studio are used for fine-tuning reports and ad hoc reporting
  • It offers enhanced functionalities

Requirements to install Pentaho

Hardware requirements:

Pentaho does not impose fixed limits on system or network hardware. It is easy to install, and the recommended system specifications are as follows:

  • System RAM should be at least 2 GB
  • Free hard drive space should be at least 1 GB
  • The processor should be a dual-core AMD64 or EM64T

Software requirements:

  • It supports operating systems such as Linux, Windows, Solaris, and Mac
  • The operating system can be either 32-bit or 64-bit
  • Sun JRE 5.0 must be installed
  • The system must have a modern web browser such as Firefox, Internet Explorer, or Chrome

Start the BI server:

  • For Windows, click the Start BI Server icon.
  • For Linux, run the start-pentaho script in the /biserver-ce/ directory.

Start the administration server:

  • For Windows, click Start BI Enterprise Server in the Start menu.
  • For Linux, open a command window and run the start-up script in the /biserver-ce/administration-console/ directory.

Stop the administration server:

  • For Windows, click the Stop BI Server icon.
  • For Linux, open a terminal, go to the installation directory, and run the stop script (on Linux this is a shell script rather than stop.bat, which is a Windows batch file).

Pentaho Administration Console

The Pentaho Administration Console provides a central location from which to administer Pentaho deployments. It simplifies administrative tasks such as scheduling jobs, managing users and roles, and managing services. The Administration Console provides limited functionality compared to the full-featured, subscription-only Pentaho Enterprise Console. It changes the way the user interacts with Pentaho deployments by automating some tasks that would otherwise be performed manually.

Pentaho User Console

The Pentaho User Console is a web-based design environment where the user can analyze data, create interactive reports and dashboard reports, and build integrated dashboards to share Business Intelligence solutions with others. It provides design tools that help to develop and refine the data values that are reported, transformed, modeled, and stored. These tools include the following:

  • Report Designer: An advanced report creation tool that helps the user create a complete data-driven report. It offers more scalable and flexible functionality than ad hoc reporting, and is used to generate detailed, pixel-perfect reports from virtually any data source.
  • Design Studio: An Eclipse-based tool that allows you to edit a report or analysis manually. It is used to make modifications to an existing report that cannot be performed in Report Designer.
  • Aggregation Designer: It simplifies the creation and deployment of aggregate tables that improve the performance of Mondrian OLAP cubes.
  • Metadata Editor: It is used to create metadata models and domains. It is also used to add a custom metadata layer to an existing data source.
  • Pentaho Data Integration: It provides Extract, Transform, and Load (ETL) capabilities for capturing, cleansing, and storing data in a uniform and consistent format that is accessible and relevant to end users and IoT technologies.
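
To make the ETL idea concrete, here is a minimal plain-Java sketch of the three stages. It does not use the Pentaho Data Integration API; the class name EtlSketch, the "name,department" line layout, and the sample rows are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Illustrative only: plain Java, not Pentaho Data Integration code.
final class EtlSketch {

    // Extract: split each raw line into fields (assumed "name,department" layout).
    // Transform: trim whitespace, normalise the department to upper case,
    //            and drop malformed or incomplete rows.
    // Load: append the cleaned row to a list that stands in for the target table.
    static List<String> run(List<String> rawLines) {
        List<String> target = new ArrayList<>();
        for (String line : rawLines) {
            String[] fields = line.split(",", -1);
            if (fields.length != 2) {
                continue; // malformed row: wrong number of fields
            }
            String name = fields[0].trim();
            String department = fields[1].trim().toUpperCase(Locale.ROOT);
            if (name.isEmpty() || department.isEmpty()) {
                continue; // cleansing: skip rows with missing values
            }
            target.add(name + "|" + department); // one uniform output format
        }
        return target;
    }

    public static void main(String[] args) {
        // Made-up sample input with inconsistent spacing and casing.
        List<String> raw = Arrays.asList(" Asha , sales", "Ravi,Sales ", ",hr", "broken-row");
        System.out.println(run(raw));
    }
}
```

In a real deployment these stages would be configured as steps in the ETL tool rather than hand-written; the sketch only shows what "uniform and consistent format" means in practice.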

Pentaho Visualization

The Visualization API provides a unified way to visualize data across the Pentaho suite, including PDI, Analyzer, and CDF. It allows safe and isolated operation between third-party applications, business logic, and visualizations.

The Visualization API is built on top of the following JavaScript APIs:

  • Data API: It provides integration with data sources in the Pentaho platform and with client-side component frameworks.
  • Type API: It gives features like validation, metadata support, inheritance, and serialization.
  • Core API: It includes core features such as theming and services, registration, consumption, and localization.

This API is used to create, configure, and deploy a visualization.

Pentaho vs Talend

Feature | Pentaho | Talend
Approach | Metadata-driven approach | Code-generating approach
Data integration | Provides various ETL capabilities, including data migration from the database to the application | Its graphical development environment makes data integration easy and efficient
Data quality | Collaborates with companies that have leading data solutions and provides its own quality firewall | Achieves data quality with tools such as metadata manager, data explorer, and pattern manager
Platform | Supports cloud, Windows, Mac, mobile, iOS, etc. | Supports Windows, mobile, iOS, cloud, Mac, etc.
Community support | Provides strong community support and, together with Hitachi Vantara, offers a 24/7 support portal for customers | Also provides strong community support, but requires registering for a technical support account
Documentation | Provides a user manual, component documentation, and an installation guide in PDF format | Provides online documentation along with the Pentaho Kettle solution
Training | Provides training in person, through webinars, and in online sessions | Provides training only through documentation
Repository | Files can be stored on a personal system or in a centralized database repository, and can be in XML format | Files can be stored on a personal system
Monitoring | Provides adequate monitoring and logging tools | Provides proper monitoring and logging tools

Advantages of using Pentaho

  • Easy installation
  • Pentaho BI is built on basic concepts that make it easy to work with
  • Streaming engine architecture that provides the ability to work with huge data volumes
  • Ease of use and high scalability
  • Offers a user-friendly interface and various tools to retrieve data from multiple data sources
  • Enterprise Data Integration server provides security integration, robust content management, and scheduling with a complete history of jobs and transformations
  • It is capable of running on a Hadoop cluster
  • JavaScript code written in a step component can be reused in other components
  • It provides a single package to work on data
  • It offers a wide range of Business Intelligence capabilities, including dashboards, data integration, reporting, data mining, and interactive analysis
  • It provides 24/7 community support for technical queries

Disadvantages of using Pentaho

  • It is much slower than other BI tools
  • It lacks a unified interface for all the components
  • It offers a limited number of components

What is Pentaho Reporting?

Pentaho Reporting is a suite (a collection of tools) for creating relational and analytical reports. Using Pentaho, we can transform complex data into meaningful reports and draw information out of them. Pentaho supports creating reports in various formats such as HTML, Excel, PDF, Text, CSV, and XML.

Pentaho can accept data from different data sources including SQL databases, OLAP data sources, and even the Pentaho Data Integration ETL tool.

Pentaho – Reporting Elements :

Most reporting elements can easily be added by dragging and dropping them from the Data pane to any of the bands on the workspace (mostly Details band).

Let us continue with the same example taken from the previous chapter. There we have added a data source and a query to the Reporting Designer. Here we will design the report based on the output produced by the query.

The resultant query fields are the reporting elements which are highlighted in the following screenshot. Those are − id, name, designation, department, and age.

Adding Reporting Elements :

After adding the query to the Reporting designer, the resultant fields appear in the data pane, as shown in the following screenshot.

Now, drag the required fields (fields you want to display in the report) from the Structure Pane into the Details Band at the center of the main workspace.

Take a look at the following screenshot. It shows the direction to drag the age field from the structure pane.

After arranging all the fields in the Details band, you can see the report view by clicking the view button, which is marked as “1” in the above screenshot.

After clicking the view button, the resulting report will be as shown in the following screenshot. In the workspace, you will find the values of all the fields displayed.

Pentaho – Page Footer Fields :

Each page of a report contains two special areas. At the top of every page, you will find the page-header area. And at the bottom of the page, you will find the page-footer area. The remaining page is available for the actual report content.

Adding Page Footer Fields in Pentaho :

The page footer tab is used to present some attributes and functions. For example, you can use the page footer tab to print the max value of the age field at the bottom of the page. The reporting engine allows these features by using reporting functions.

Take a look at the following screenshot. Here the Data tab in the structure pane includes a symbol fx (marked as “1”). It is the add function button. Click this button to add different functions into the report.

Pentaho – Groups :

Pentaho offers various functionalities and features to convert raw data into meaningful information. Here, we will learn how to use one such feature, Groups. You can use this feature to segregate raw row-set data into groups so that the user can easily understand the report.

  • Grouping is a great way to divide long lists of data along meaningful separators.
  • With groups, you can keep similar items together and visually separate these items from other groups of items.
  • You will also need groups to perform various aggregations over the data, like printing the number of items in a group or calculating sums or averages.
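
As a rough illustration of what grouping does to a row set, the plain-Java sketch below buckets employee rows by department and computes a count and an average age per group. It is not Pentaho Reporting API code; the class GroupSketch, its Employee type, and the sample rows are hypothetical and only mirror the employee fields used in this tutorial.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: shows the effect of grouping, not how the reporting engine implements it.
final class GroupSketch {

    // One row of the employee result set; only the columns needed here.
    static final class Employee {
        final String name;
        final String department;
        final int age;

        Employee(String name, String department, int age) {
            this.name = name;
            this.department = department;
            this.age = age;
        }
    }

    // Buckets rows by department, keeping departments in first-seen order.
    static Map<String, List<Employee>> groupByDepartment(List<Employee> rows) {
        Map<String, List<Employee>> groups = new LinkedHashMap<>();
        for (Employee e : rows) {
            groups.computeIfAbsent(e.department, k -> new ArrayList<>()).add(e);
        }
        return groups;
    }

    // Average age within one non-empty group, the kind of value a group footer might print.
    static double averageAge(List<Employee> group) {
        int sum = 0;
        for (Employee e : group) {
            sum += e.age;
        }
        return (double) sum / group.size();
    }

    public static void main(String[] args) {
        // Made-up sample rows.
        List<Employee> rows = new ArrayList<>();
        rows.add(new Employee("Asha", "Sales", 30));
        rows.add(new Employee("Ravi", "HR", 40));
        rows.add(new Employee("Meena", "Sales", 50));

        for (Map.Entry<String, List<Employee>> g : groupByDepartment(rows).entrySet()) {
            System.out.println(g.getKey() + ": count=" + g.getValue().size()
                    + ", average age=" + averageAge(g.getValue()));
        }
    }
}
```

Each map entry corresponds to one group in the report: the key plays the role of the group header, and the count and average are the per-group aggregations mentioned above.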

Adding Groups in Pentaho :

We already have a report based on the employee data. Here the requirement is to group all employee records according to “department”. Follow the procedure given below to fulfill this requirement.

Start by clicking the Add group button on the Structure pane. The Add group button is marked as “1” in the following screenshot.

Pentaho – Chart Report:

A chart, also called a graph, is a graphical representation of data. Pentaho Reporting Designer offers a wide variety of chart types. You can design a chart using the “chart-element” option available in the palette of Pentaho Reporting Designer.

There are three requirements to create a chart −

  • A data-collector to extract the charting-data from the data sources.
  • A chart-expression to produce a chart from the collected data.
  • A report element to display the resulting chart object.

Let us now take an example and try to understand the process of creating charts in Pentaho.

First of all, create a table named car based on the given table data. We are using a MySQL database as the data source. Add that data source to the Pentaho Reporting Designer. We have already discussed how to add a data source and a query to the Pentaho Reporting Designer in the chapter “Pentaho – Data Sources and Queries”.

Steps to Create a Chart in Pentaho :

Step 1 – Add a Query

The query retrieves all the records from the table car. Therefore the query should be as follows −

SELECT * FROM car;

Step 2 – Add an Image to the Page Header

This step is optional, but worthwhile, because images play an important role in improving the aesthetics of a report.

Take a look at the following screenshot. We have highlighted the following five activities −

  • After adding a query, you will find the query fields on the Structure pane. From the Structure pane, select the respective fields, drag and drop into the Details tab of the report workspace.
  • The Details tab presents only the field values. Add those respective heading Labels to the Report Header tab by selecting the label field from the palette.
  • Add an image taken from the palette and place it on the Page Header Tab. Add another label in the Page Header for Report Heading and use a suitable heading, for example CAR – CHART, for effective presentation. Double-click on the image element – you will find a dialogue box called Edit Content.
  • You have two options to insert an image. Either link the image URL to the report or embed the image into the report. We chose to embed the image by selecting the option “Embed in Report”.
  • Download a sample car image from the internet to put into the Page Header. Click the button to locate the sample car image by its location URL. Click the OK button to confirm.

Step 3 – Add Chart into Report Footer

Add the chart by selecting it in the left-side palette and dragging it onto the Report Footer. It is marked as “1” in the following screenshot. Then double-click the Bar chart element on the Report Footer.

Step 4 – Add Chart Properties

After double-clicking the chart element, you will find a dialogue box where you have to provide the data collector details and the chart expression details.

Take a look at the following screenshot. The tab Primary Data Source contains two markers −

  • Marker “1” is a dropdown list where you have to select the Category Set Data Collector.
  • Marker “2” is also a dropdown list of the category-column where you have to select the name field.

There are three sections in the Primary Data Source − Common, Series, and Group. Here, we do not need to add anything in the Group section because we are not using any groups in our query.

Common − There are two fields in this section − category-column and value-column. We already filled the category-column value with the name field in the above section. The second one is value-column.

Click on the empty value; you will find a dialogue box as shown in the following screenshot. There are two activities (1 and 2) marked in it.

  • Click the (+) button to add the value fields in the column.
  • By clicking on the empty value, you will find a dropdown list from where you need to select the speed field.

Repeat the above two activities to add user_rating, mileage, and safety fields into the column. After adding all these fields, the screen will appear as shown below. Click OK to confirm.

The next section in the Primary Data Source tab is the Series section.

Series − In the Series field, click the series-by-value option. You will find a dialogue box as shown in the following screenshot. There are two markers (1 and 2) in it.

  • Click the (+) button to add a new field in the column.
  • By double-clicking on it, you can edit that field.

Repeat these two activities for adding field names such as Speed, User Rating, Mileage, and Safety.

These are the user-defined names shown in the corresponding section of the report chart. Here you have to follow the same order that you used for the value-column fields in the Common section.
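
The ordering requirement is easier to see in code. The sketch below assumes, based on the rule above, that series names are matched to value columns by position; it is plain Java rather than Pentaho API code, and the class name SeriesOrderSketch and method pair are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: models a positional match between value columns and series names.
final class SeriesOrderSketch {

    // Pairs each value column with the series name at the same index.
    static Map<String, String> pair(List<String> valueColumns, List<String> seriesNames) {
        if (valueColumns.size() != seriesNames.size()) {
            throw new IllegalArgumentException("each value column needs exactly one series name");
        }
        Map<String, String> legend = new LinkedHashMap<>();
        for (int i = 0; i < valueColumns.size(); i++) {
            legend.put(valueColumns.get(i), seriesNames.get(i));
        }
        return legend;
    }

    public static void main(String[] args) {
        // Field names and labels taken from the car example in this section.
        List<String> valueColumns = Arrays.asList("speed", "user_rating", "mileage", "safety");
        List<String> seriesNames = Arrays.asList("Speed", "User Rating", "Mileage", "Safety");
        System.out.println(pair(valueColumns, seriesNames));
    }
}
```

If the two lists were entered in different orders, a column such as mileage would simply be labelled with the wrong series name, which is why the order in the Series section has to match the Common section.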

After adding all the sections, you will get the Edit Chart dialogue box as shown in the following screenshot. The Bar Chart pane contains different properties which are used for changing the chart format.

Pentaho – Functions

Each page of a report contains a Page-header area and a Page-footer area. The remaining page is available for the actual report content.

The page footer tab is used to present some attributes and functions. For example, we can print the max value of the age field of an employee in the Page-footer. The reporting engine allows these features by using functions.

Let us use the same employee table which we have used in the previous chapters. After adding all the fields into the report workspace, let us now add a function into the Page-footer tab to find out the maximum age of an employee.

Follow the steps given below to add a predefined function to your report.

Step 1 – Click the Function Button

Take a look at the following screenshot. Here the Data tab in the structure pane includes a symbol fx (marked as “1”). It is the add function button. Click this button to add different functions into the report.

Step 2 – Select a Particular Function

Then, you will find a dialog box with different functions segregated into different groups. To print the maximum age of an employee at the page footer, we should choose the Maximum function in the Summary group which is marked as “1” in the following screenshot. Select it and click OK.

Once you click the OK button, the function is added under the Functions label in the Structure pane, which is placed at the right side of the screen.

Step 3 – Define a Field Name

Once you select the added function (i.e. Maximum), you will find another pane below the structure pane containing the properties of that function.

Take a look at the following screen. The maximized box contains two pointers (Pointer 1 and Pointer 2).

  • Pointer 1 − Select the function in the data tab of the structure pane.
  • Pointer 2 − Edit the Field name in the properties section by selecting the age field from the dropdown list. It is because we have to print the maximum age of an employee.

Step 4 – Add a Function to Report Workspace

The function is now ready with the customized properties, and you can use it in your report as a page footer attribute.

Take a look at the following screenshot. Again, it contains two pointers (Pointer 1 and Pointer 2).

  • Pointer 1 − Select and drag the Maximum function from the Structure pane to the page footer band in the workspace, as shown in the following screenshot. Now the design of your report is ready.
  • Pointer 2 − Select the Preview button on the left side of the screen.

Step 5 – Check Preview

Take a look at the following screen. It shows the preview of the report. The maximum age of an employee is marked and shown in a maximized box.
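
Conceptually, the value printed in the footer is the largest age seen across the report's rows. The plain-Java sketch below models that as a running maximum fed one row at a time; this row-by-row model is an assumption for illustration, not the reporting engine's actual implementation, and the class RunningMaximum is hypothetical.

```java
// Illustrative only: a running maximum over the age column.
final class RunningMaximum {
    private boolean seen = false;
    private int max;

    // Called once per detail row with that row's age value.
    void accept(int age) {
        if (!seen || age > max) {
            max = age;
            seen = true;
        }
    }

    // The value that would be printed in the page footer.
    int value() {
        if (!seen) {
            throw new IllegalStateException("no rows processed");
        }
        return max;
    }

    // Convenience helper: feeds every age through a fresh RunningMaximum.
    static int of(int[] ages) {
        RunningMaximum m = new RunningMaximum();
        for (int age : ages) {
            m.accept(age);
        }
        return m.value();
    }

    public static void main(String[] args) {
        int[] ages = {25, 41, 33}; // made-up sample ages
        System.out.println("Maximum age: " + of(ages));
    }
}
```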
