Pandas DataFrame - Python Tutorials

Welcome back, everyone, to our new lecture on Pandas DataFrame. If you missed the previous lecture on Pandas Series, please have a look at it first.

Let's start without wasting time.

What is Pandas DataFrame?

DataFrames are the workhorse of pandas and are directly inspired by the R programming language. We can think of a DataFrame as a bunch of Series objects put together to share the same index. In simple words, a Pandas DataFrame is like a 2-dimensional array with labeled rows and columns.
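To make the "Series sharing the same index" idea concrete, here is a minimal sketch. The names w, x and frame and the values in them are made up for illustration only.

```python
import pandas as pd

# Two Series that share the same index labels (illustrative values)
w = pd.Series([1.0, 2.0, 3.0], index=['A', 'B', 'C'], name='W')
x = pd.Series([10.0, 20.0, 30.0], index=['A', 'B', 'C'], name='X')

# Putting the Series side by side gives a DataFrame with one column per Series
frame = pd.DataFrame({'W': w, 'X': x})

print(frame.shape)          # (number of rows, number of columns)
print(list(frame.index))    # the shared index labels
print(list(frame.columns))  # one column per Series
```

Each column keeps the row labels of the Series it came from, which is what lets pandas line the values up row by row.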

You can check the documentation on Pandas DataFrame on its official site pandas.pydata.org.

Let’s use pandas to explore this topic!

import pandas as pd
import numpy as np

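from numpy.random import randn
np.random.seed(101)  # fix the seed so the random numbers are reproducible

# 5 rows (A-E) and 4 columns (W-Z) of standard-normal random values
df = pd.DataFrame(randn(5, 4), index='A B C D E'.split(), columns='W X Y Z'.split())

# Display the DataFrame
df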

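Output:
          W         X         Y         Z
A  2.706850  0.628133  0.907969  0.503826
B  0.651118 -0.319318 -0.848077  0.605965
C -2.018168  0.740122  0.528813 -0.589001
D  0.188695 -0.758872 -0.933237  0.955057
E  0.190794  1.978757  2.605967  0.683509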

Selection and Indexing

Let’s learn the various methods to grab data from a DataFrame.

# Pass a list of column names
df[['W','Z']]

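Output will be:
          W         Z
A  2.706850  0.503826
B  0.651118  0.605965
C -2.018168 -0.589001
D  0.188695  0.955057
E  0.190794  0.683509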

# Attribute (dot) syntax (NOT RECOMMENDED: a column name can clash with a DataFrame method)
df.W

Output will be:
A    2.706850
B    0.651118
C   -2.018168
D    0.188695
E    0.190794
Name: W, dtype: float64
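Here is a small sketch of why the dot syntax is discouraged. The frame sales and its column names are hypothetical, chosen so that one column name collides with an existing DataFrame method.

```python
import pandas as pd

# Hypothetical frame with a column whose name matches a DataFrame method
sales = pd.DataFrame({'count': [3, 5], 'price': [1.5, 2.0]})

# Bracket syntax always returns the column
col = sales['count']
print(type(col).__name__)

# Dot syntax finds the DataFrame.count method first, not the column
attr = sales.count
print(callable(attr))
```

Bracket syntax also works for column names that contain spaces, which the dot syntax cannot express at all.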

Note: each DataFrame column is itself a Series.
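A quick way to check this is to look at the type of what comes back. The sketch below rebuilds the same df so it runs on its own; the names single and several are illustrative.

```python
import numpy as np
import pandas as pd
from numpy.random import randn

np.random.seed(101)
df = pd.DataFrame(randn(5, 4), index='A B C D E'.split(), columns='W X Y Z'.split())

single = df['W']           # one column name -> Series
several = df[['W', 'Z']]   # a list of column names -> DataFrame

print(type(single))
print(type(several))
```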

How to create a new column?

Creating a new column:

df['new'] = df['W'] + df['Y']

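Removing that new column (drop returns a new DataFrame; df itself is unchanged unless you assign the result back):

df.drop('new', axis=1)  # axis=1 refers to columns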

Output:
          W         X         Y         Z
A  2.706850  0.628133  0.907969  0.503826
B  0.651118 -0.319318 -0.848077  0.605965
C -2.018168  0.740122  0.528813 -0.589001
D  0.188695 -0.758872 -0.933237  0.955057
E  0.190794  1.978757  2.605967  0.683509
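One detail that is easy to miss is that drop does not change df in place by default. The sketch below, with the illustrative name dropped, shows the original frame still holding the column until the result is assigned back.

```python
import numpy as np
import pandas as pd
from numpy.random import randn

np.random.seed(101)
df = pd.DataFrame(randn(5, 4), index='A B C D E'.split(), columns='W X Y Z'.split())
df['new'] = df['W'] + df['Y']

# drop returns a new DataFrame without the column
dropped = df.drop('new', axis=1)
still_in_original = 'new' in df.columns
print(still_in_original)           # the original frame still has 'new'
print('new' in dropped.columns)    # the returned frame does not

# Assign the result back to make the removal stick
df = df.drop('new', axis=1)
print(list(df.columns))
```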

We can also drop rows this way:

df.drop('E', axis=0)  # last row dropped

Output will be:

You can select a row from a DataFrame in two different ways: by label with .loc, or by integer position with .iloc.
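A minimal sketch of the two ways, label-based .loc and position-based .iloc, is shown below. The frame is rebuilt so the snippet runs on its own, and the names row_by_label and row_by_position are illustrative.

```python
import numpy as np
import pandas as pd
from numpy.random import randn

np.random.seed(101)
df = pd.DataFrame(randn(5, 4), index='A B C D E'.split(), columns='W X Y Z'.split())

# Select the row whose index label is 'A'
row_by_label = df.loc['A']

# Select the row at integer position 0 (the first row)
row_by_position = df.iloc[0]

# Both return the same row as a Series indexed by the column names
print(row_by_label.equals(row_by_position))
```

A single row comes back as a Series, just like a single column does.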

Selecting a subset of rows and columns:

df.loc[['A','B'],['W','Y']]

Output:
          W         Y
A  2.706850  0.907969
B  0.651118 -0.848077

Multi-Index and Index Hierarchy

Let us go over how to work with a Multi-Index. First, we'll create a quick example of what a Multi-Indexed DataFrame looks like:

# Index Levels
outside = ['G1','G1','G1','G2','G2','G2']
inside = [1,2,3,1,2,3]
hier_index = list(zip(outside,inside))
hier_index = pd.MultiIndex.from_tuples(hier_index)

# print the index
hier_index

Output:
MultiIndex(levels=[['G1', 'G2'], [1, 2, 3]],
           labels=[[0, 0, 0, 1, 1, 1], [0, 1, 2, 0, 1, 2]])

df = pd.DataFrame(np.random.randn(6,2),index=hier_index,columns=['A','B'])
df
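Once the hierarchy is in place, rows can be picked out level by level. The sketch below is one way to do it, not the only one. The level names 'Group' and 'Num' and the variable names g1, row, cell and num_one are assumptions added for illustration, and the random values are arbitrary.

```python
import numpy as np
import pandas as pd

outside = ['G1', 'G1', 'G1', 'G2', 'G2', 'G2']
inside = [1, 2, 3, 1, 2, 3]
hier_index = pd.MultiIndex.from_tuples(list(zip(outside, inside)))
df = pd.DataFrame(np.random.randn(6, 2), index=hier_index, columns=['A', 'B'])

# All rows under the outer label 'G1'; the result is indexed by 1, 2, 3
g1 = df.loc['G1']

# One row, addressed by an (outer, inner) tuple; returns a Series
row = df.loc[('G1', 1)]

# One cell: (outer, inner) row key plus a column name
cell = df.loc[('G2', 3), 'B']

# Naming the levels (hypothetical names) allows a cross-section by inner level
df.index.names = ['Group', 'Num']
num_one = df.xs(1, level='Num')   # rows with Num == 1 from every group

print(g1.shape, row.shape, num_one.shape)
```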

Great Job!

We have touched on almost all the main points of Pandas DataFrame. If you have any questions about this topic, please contact us through the comment section, or you can also mail us.

Thanks 🙂

technicalblog.in
