Python Pandas Cheat Sheet Tutorial
Last updated on 26th Sep 2020
Pandas is an open-source Python library that provides high-performance data manipulation and analysis tools built on its powerful data structures. The name pandas is derived from "panel data", an econometrics term for multidimensional data sets.
In 2008, developer Wes McKinney started developing pandas because he needed a high-performance, flexible tool for data analysis.
Prior to Pandas, Python was mostly used for data munging and preparation and contributed very little to data analysis itself. Pandas solved this problem. Using Pandas, we can accomplish five typical steps in the processing and analysis of data, regardless of the origin of the data: load, prepare, manipulate, model, and analyze.
Python with Pandas is used in a wide range of academic and commercial fields, including finance, economics, statistics, and analytics.
Key Features of Pandas
- Fast and efficient DataFrame object with default and customized indexing.
- Tools for loading data into in-memory data objects from different file formats.
- Data alignment and integrated handling of missing data.
- Reshaping and pivoting of data sets.
- Label-based slicing, indexing and subsetting of large data sets.
- Columns from a data structure can be deleted or inserted.
- Group by data for aggregation and transformations.
- High performance merging and joining of data.
- Time Series functionality.
Pandas deals with the following three data structures −
- Series
- DataFrame
- Panel
These data structures are built on top of NumPy arrays, which means they are fast.
Dimension & Description
The best way to think of these data structures is that the higher dimensional data structure is a container of its lower dimensional data structure. For example, DataFrame is a container of Series, Panel is a container of DataFrame.
| Data Structure | Dimensions | Description |
|---|---|---|
| Series | 1 | 1D labeled homogeneous array, size-immutable. |
| DataFrame | 2 | General 2D labeled, size-mutable tabular structure with potentially heterogeneously typed columns. |
| Panel | 3 | General 3D labeled, size-mutable array. |
Building and handling arrays of two or more dimensions is tedious, because the burden is placed on the user to consider the orientation of the data set when writing functions. Pandas data structures reduce this mental effort.
For example, with tabular data (DataFrame) it is more semantically helpful to think of the index (the rows) and the columns rather than axis 0 and axis 1.
Mutability
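All Pandas data structures are value mutable (their contents can be changed), and all except Series are size mutable. Series is size immutable.
Note − DataFrame is widely used and one of the most important data structures. Panel is used much less; it was deprecated in pandas 0.20 and removed in pandas 0.25, so current versions provide only Series and DataFrame.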
Series
Series is a one-dimensional array-like structure with homogeneous data. For example, the following series is a collection of integers 10, 23, 56, …
- 10, 23, 56, 17, 52, 61, 73, 90, 26, 72
Key Points
- Homogeneous data
- Size Immutable
- Values of Data Mutable
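As a minimal sketch of the key points above, the snippet below builds a Series from the integers in the example (the values after 56 are read off the listing above) and shows that all elements share one dtype and that an individual value can be overwritten.

```python
import pandas as pd

# A Series holding the integers from the example above.
s = pd.Series([10, 23, 56, 17, 52, 61, 73, 90, 26, 72])

# Homogeneous data: every element shares a single integer dtype.
print(s.dtype)

# Values are mutable: overwrite the first element in place.
s.iloc[0] = 11
print(s.iloc[0])

# The assignment changes a value, not the number of elements.
print(len(s))
```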
DataFrame
DataFrame is a two-dimensional array with heterogeneous data. For example,
| Name | Age | Gender | Rating |
|---|---|---|---|
| Steve | 32 | Male | 3.45 |
| Lia | 28 | Female | 4.6 |
| Vin | 45 | Male | 3.9 |
| Katie | 38 | Female | 2.78 |
The table represents the data of a sales team of an organization with their overall performance rating. The data is represented in rows and columns. Each column represents an attribute and each row represents a person.
Data Type of Columns
The data types of the four columns are as follows −
| Column | Type |
|---|---|
| Name | String |
| Age | Integer |
| Gender | String |
| Rating | Float |
Key Points
- Heterogeneous data
- Size Mutable
- Data Mutable
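The sales-team table above can be rebuilt as a DataFrame to check these points. This is only a sketch: the variable name `team` and the extra `Senior` column are made up here to illustrate size mutability and are not part of the original table.

```python
import pandas as pd

# The sales-team table from above, one dictionary key per column.
team = pd.DataFrame({
    "Name": ["Steve", "Lia", "Vin", "Katie"],
    "Age": [32, 28, 45, 38],
    "Gender": ["Male", "Female", "Male", "Female"],
    "Rating": [3.45, 4.6, 3.9, 2.78],
})

# Heterogeneous data: each column carries its own dtype.
print(team.dtypes)

# Size mutable: add a (hypothetical) column derived from Age.
team["Senior"] = team["Age"] > 35
print(team.shape)
```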
Panel
Panel is a three-dimensional data structure with heterogeneous data. A panel is hard to represent graphically, but it can be pictured as a container of DataFrames.
Key Points
- Heterogeneous data
- Size Mutable
- Data Mutable
Import Convention:
We need to import the library before we get started.
import pandas as pd
Pandas Data Structure:
In practice we work with two data structures in Pandas, Series and DataFrame.
Series is a one-dimensional labeled array that can hold any data type.
DataFrame is a two-dimensional, potentially heterogeneous tabular data structure.
Put another way, a Series is the data structure for a single column of a DataFrame.
Now let us see some examples of Series and DataFrames for better understanding.
Series:
- s = pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])
Data Frame:
- data_mobile = {'Mobile': ['iPhone', 'Samsung', 'Redmi'], 'Color': ['Red', 'White', 'Black'], 'Price': ['High', 'Medium', 'Low']}
- df = pd.DataFrame(data_mobile, columns=['Mobile', 'Color', 'Price'])
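To make the "a Series is a single column of a DataFrame" point concrete, here is a small sketch using the mobile data above. It assumes the Price values are meant as strings; the variable name `color` is introduced only for this example.

```python
import pandas as pd

data_mobile = {
    "Mobile": ["iPhone", "Samsung", "Redmi"],
    "Color": ["Red", "White", "Black"],
    "Price": ["High", "Medium", "Low"],  # assumed to be string labels
}
df = pd.DataFrame(data_mobile, columns=["Mobile", "Color", "Price"])

# Selecting one column by name returns a Series.
color = df["Color"]
print(type(color).__name__)
print(color.tolist())
```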
Importing Data:
The Pandas library offers a set of reader functions that work on a wide range of file formats and return a Pandas object. Here is a list of reader functions.
- pd.read_csv("filename")
- pd.read_table("filename")
- pd.read_excel("filename")
- pd.read_sql(query, connection_object)
- pd.read_json(json_string)
Similarly, we have a list of write operations which are useful while writing data into a file.
- df.to_csv("filename")
- df.to_excel("filename")
- df.to_sql(table_name, connection_object)
- df.to_json("filename")
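The reader and writer functions pair up naturally. The sketch below round-trips a small DataFrame through CSV using an in-memory `io.StringIO` buffer in place of a file on disk; the column names and prices are made-up sample values.

```python
import io
import pandas as pd

# Hypothetical sample data for the round trip.
df = pd.DataFrame({"Mobile": ["iPhone", "Samsung"], "Price": [999, 799]})

# Write the DataFrame as CSV into an in-memory text buffer.
buffer = io.StringIO()
df.to_csv(buffer, index=False)  # index=False leaves the row labels out

# Rewind the buffer and read the CSV back into a new DataFrame.
buffer.seek(0)
restored = pd.read_csv(buffer)
print(restored)
```

Passing a real path such as a `.csv` file name instead of the buffer works the same way.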
Create Test/Fake Data:
The Pandas library lets us create fake or test data in order to test our code segments. Check out the examples given below (the first one also needs import numpy as np).
- pd.DataFrame(np.random.rand(4,3)) – 4 rows and 3 columns of random floats
- pd.Series(new_series) – Creates a series from an iterable new_series
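A runnable version of the two calls above might look like the following. Seeding NumPy's random generator and naming the columns are additions made here so the test data is repeatable and readable; `new_series` is just a stand-in iterable.

```python
import numpy as np
import pandas as pd

# Seed the generator so the "random" test data is repeatable.
np.random.seed(0)

# 4 rows and 3 columns of random floats in [0, 1).
df = pd.DataFrame(np.random.rand(4, 3), columns=["a", "b", "c"])
print(df)

# Any iterable can be turned into a Series.
new_series = range(5)
s = pd.Series(new_series)
print(s)
```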
Operations:
Here we have mentioned various inbuilt functions and their operations.
View DataFrame contents:
- df.head(n) – look at first n rows of the DataFrame.
- df.tail(n) – look at last n rows of the DataFrame.
- df.shape – Gives the number of rows and columns (an attribute, so no parentheses).
- df.info() – Information on index, data types and memory usage.
- df.describe() – Summary statistics for numerical columns.
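The inspection helpers above can be tried on a small frame. The columns `x` and `y` below are invented for illustration.

```python
import pandas as pd

# Hypothetical frame with ten rows and two numeric columns.
df = pd.DataFrame({"x": range(10), "y": [v * 0.5 for v in range(10)]})

first = df.head(3)   # first three rows
last = df.tail(2)    # last two rows
print(first)
print(last)

# shape is an attribute, so it is read without parentheses.
print(df.shape)

# info() prints index, dtype and memory details.
df.info()

# describe() returns count, mean, std, min, quartiles and max per numeric column.
summary = df.describe()
print(summary)
```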
Selecting:
Often we want to select and look at a chunk of data from our DataFrame. There are two ways of doing this.
First, selecting by position and second, selecting by label.
- Selecting by position using iloc:
- df.iloc[0] – Select first row of data frame
- df.iloc[1] – Select second row of data frame
- df.iloc[-1] – Select last row of data frame
- df.iloc[:,0] – Select first column of data frame
- df.iloc[:,1] – Select second column of data frame
- Selecting by label using loc:
- df.loc[0, 'column1'] – Select a single value by row label and column label
- df.loc['row1':'row3', 'column1':'column3'] – Select by slicing on labels
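The difference between position-based and label-based selection is easiest to see on a frame with string row labels. This is a sketch with invented data, reusing the row1/column1 naming from the list above.

```python
import pandas as pd

df = pd.DataFrame(
    {"column1": [1, 2, 3], "column2": [4, 5, 6], "column3": [7, 8, 9]},
    index=["row1", "row2", "row3"],
)

# iloc selects by integer position.
first_row = df.iloc[0]       # first row, returned as a Series
last_row = df.iloc[-1]       # last row
first_col = df.iloc[:, 0]    # first column

# loc selects by label.
value = df.loc["row2", "column3"]   # a single value

# Label slices include both endpoints, unlike positional slices.
block = df.loc["row1":"row2", "column1":"column2"]
print(block)
```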
Sorting:
Another very simple yet useful feature offered by Pandas is the sorting of DataFrame.
- df.sort_index() – Sorts by labels along an axis
- df.sort_values(column1) – Sorts values by column1 in ascending order
- df.sort_values(column2, ascending=False) – Sorts values by column2 in descending order
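Here is a short sketch of the three sorting calls on invented data. Each call returns a new, sorted DataFrame and leaves the original untouched.

```python
import pandas as pd

df = pd.DataFrame(
    {"column1": [2, 3, 1], "column2": [30, 10, 20]},
    index=["c", "a", "b"],
)

by_label = df.sort_index()                           # rows ordered a, b, c
asc = df.sort_values("column1")                      # column1 ascending
desc = df.sort_values("column2", ascending=False)    # column2 descending

print(by_label)
print(asc)
print(desc)
```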
Groupby:
With the groupby technique you can split the data into groups by category and then apply a function to each group. This simple yet valuable technique is used widely in data science.
- df.groupby(column) – Returns a groupby object for values from one column
- df.groupby([column1,column2]) – Returns a groupby object for values from multiple columns
- df.groupby(column1)[column2].mean() – Returns the mean of the values in column2, grouped by the values in column1
- df.groupby(column1)[column2].median() – Returns the median of the values in column2, grouped by the values in column1
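The grouped mean and median can be compared on a tiny example. The data below is invented and chosen so that the two statistics differ for group "b".

```python
import pandas as pd

df = pd.DataFrame({
    "column1": ["a", "a", "b", "b", "b"],
    "column2": [1, 3, 2, 4, 9],
})

# One result per distinct value of column1.
means = df.groupby("column1")["column2"].mean()
medians = df.groupby("column1")["column2"].median()

print(means)
print(medians)
```

For group "b" the values 2, 4 and 9 give a mean of 5 but a median of 4, which shows why the two calls are not interchangeable.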
Functions:
There are some special methods available in Pandas that make our calculations easier. Let's apply those methods to our Product_Review DataFrame.
- Mean: df.mean() – mean of each column
- Median: df.median() – median of each column
- Standard Deviation: df.std() – standard deviation of each column
- Max: df.max() – highest value in each column
- Min: df.min() – lowest value in each column
- Count: df.count() – number of non-null values in each DataFrame column
- Describe: df.describe() – Summary statistics for numerical columns
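The source does not show the contents of the Product_Review DataFrame, so the sketch below uses invented numeric columns (`rating`, `votes`) purely to show what each method returns. One rating is left missing to show that these methods skip null values by default.

```python
import pandas as pd

# Hypothetical stand-in for the Product_Review DataFrame.
product_review = pd.DataFrame({
    "rating": [4.0, 5.0, 3.0, None],   # one missing value
    "votes": [10, 20, 30, 40],
})

print(product_review.mean())     # mean of each column, nulls skipped
print(product_review.median())   # median of each column
print(product_review.std())      # sample standard deviation of each column
print(product_review.max())      # highest value in each column
print(product_review.min())      # lowest value in each column
print(product_review.count())    # non-null values per column
```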
Plotting:
Data Visualization with Pandas is carried out in the following ways.
- Histogram
- Scatter Plot
Note: Call %matplotlib inline to set up plotting inside the Jupyter notebook.
Histogram: df.plot.hist()
Scatter Plot: df.plot.scatter(x='column1', y='column2')
Conclusion
I hope this can be a reference guide for you as well. I'll try to continuously update this as I find more useful pandas functions. If there are any functions you can't live without, please post them in the comments below!