Welcome back, everyone, to our new lecture on the Pandas DataFrame. If you missed the previous lecture on Pandas Series, please have a look at it first.
Let’s start without wasting time…
What is Pandas DataFrame?
Pandas DataFrames are the workhorse of pandas and are directly inspired by the R programming language. We can think of a DataFrame as a bunch of Series objects put together to share the same index. In simple words, a Pandas DataFrame is like a 2-dimensional array with rows and columns.
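To make the "bunch of Series sharing one index" idea concrete, here is a minimal sketch. The names s1, s2 and frame and the values in them are made up for illustration and are not part of the lecture's example.

```python
import pandas as pd

# Two Series that share the same index labels (illustrative values only)
s1 = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
s2 = pd.Series([10.0, 20.0, 30.0], index=['a', 'b', 'c'])

# Each Series becomes one column; the shared labels become the row index
frame = pd.DataFrame({'first': s1, 'second': s2})

print(frame.shape)          # (number of rows, number of columns)
print(list(frame.index))    # row labels taken from the Series
print(list(frame.columns))  # column names taken from the dict keys
```

Each dictionary key becomes a column name, and the row labels come from the index that the Series have in common.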
You can check the documentation on Pandas DataFrame on its official site pandas.pydata.org.
Let’s use pandas to explore this topic!
import pandas as pd
import numpy as np
from numpy.random import randn
np.random.seed(101)
df = pd.DataFrame(randn(5,4),index='A B C D E'.split(),columns='W X Y Z'.split())
# print
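df
Output:
W X Y Z
A 2.706850 0.628133 0.907969 0.503826
B 0.651118 -0.319318 -0.848077 0.605965
C -2.018168 0.740122 0.528813 -0.589001
D 0.188695 -0.758872 -0.933237 0.955057
E 0.190794 1.978757 2.605967 0.683509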
Selection and Indexing
Let’s learn the various methods to grab data from a DataFrame.
# Pass a list of column names
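df[['W','Z']]
Output will be:
W Z
A 2.706850 0.503826
B 0.651118 0.605965
C -2.018168 -0.589001
D 0.188695 0.955057
E 0.190794 0.683509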
# SQL-style attribute syntax (NOT RECOMMENDED! A column name can clash with a DataFrame method or attribute)
df.W
Output will be:
A 2.706850
B 0.651118
C -2.018168
D 0.188695
E 0.190794
Name: W, dtype: float64
Note: DataFrame columns are just Series.
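A quick way to check that a column is a Series is to look at its type. The sketch below rebuilds the same df as above so that it runs on its own; the variable names single and several are only illustrative.

```python
import numpy as np
import pandas as pd

# Same construction as the lecture's df, repeated so this snippet is self-contained
np.random.seed(101)
df = pd.DataFrame(np.random.randn(5, 4),
                  index='A B C D E'.split(),
                  columns='W X Y Z'.split())

single = df['W']          # one column name -> a Series
several = df[['W', 'Z']]  # a list of column names -> a DataFrame

print(type(single))
print(type(several))
```

Passing a single name gives back a Series, while passing a list of names gives back a smaller DataFrame, even if the list holds only one name.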

How to create a new column?
Creating a new column:
df['new'] = df['W'] + df['Y']
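The addition above works element by element, matching rows by their labels. A small sketch with round, made-up numbers (the frame demo is hypothetical) makes this easier to verify by eye:

```python
import pandas as pd

# Hypothetical frame with easy-to-check values
demo = pd.DataFrame({'W': [1.0, 2.0, 3.0], 'Y': [10.0, 20.0, 30.0]},
                    index=['A', 'B', 'C'])

# Assigning to a column name that does not exist yet creates the column
demo['new'] = demo['W'] + demo['Y']

print(demo)
```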
Removing that new column:
df.drop('new',axis=1)
Output:
W X Y Z
A 2.706850 0.628133 0.907969 0.503826
B 0.651118 -0.319318 -0.848077 0.605965
C -2.018168 0.740122 0.528813 -0.589001
D 0.188695 -0.758872 -0.933237 0.955057
E 0.190794 1.978757 2.605967 0.683509
We can also drop rows this way:
df.drop('E',axis=0) # last row dropped
The output is the DataFrame with row E removed.
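One detail worth keeping in mind: by default, drop returns a new DataFrame and leaves the original untouched, so the result has to be assigned back if the change should stick. The following is a minimal sketch on a hypothetical frame called demo, not the lecture's df:

```python
import pandas as pd

# Hypothetical frame for illustration
demo = pd.DataFrame({'W': [1.0, 2.0], 'Y': [3.0, 4.0]}, index=['A', 'B'])
demo['new'] = demo['W'] + demo['Y']

trimmed = demo.drop('new', axis=1)     # axis=1 drops a column, returns a new DataFrame
still_there = 'new' in demo.columns    # the original frame still has the column
print(still_there)
print('new' in trimmed.columns)

demo = demo.drop('new', axis=1)        # assign the result back to keep the change
no_b = demo.drop('B', axis=0)          # axis=0 drops a row by its label
print(list(no_b.index))
```

drop also accepts inplace=True to modify the frame directly, but assigning the result back as shown here keeps each step explicit.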
You can select a row from a DataFrame in two different ways: by label with df.loc, or by integer position with df.iloc.
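Here is a short sketch of both row selections, using loc for labels and iloc for integer positions. It rebuilds the lecture's df so that it runs on its own; row_by_label and row_by_position are illustrative names.

```python
import numpy as np
import pandas as pd

# Same construction as the lecture's df, repeated so this snippet is self-contained
np.random.seed(101)
df = pd.DataFrame(np.random.randn(5, 4),
                  index='A B C D E'.split(),
                  columns='W X Y Z'.split())

row_by_label = df.loc['C']     # label-based: the row whose index label is 'C'
row_by_position = df.iloc[2]   # position-based: the third row (counting from 0)

print(row_by_label)
print(row_by_label.equals(row_by_position))  # both refer to the same row
```

Either way, a single row comes back as a Series whose index is the DataFrame's column names.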

Selecting a subset of rows and columns
df.loc[['A','B'],['W','Y']]
Output:
W Y
A 2.706850 0.907969
B 0.651118 -0.848077
Multi-Index and Index Hierarchy
Let us go over how to work with a Multi-Index. First, we’ll create a quick example of what a Multi-Indexed DataFrame looks like:
# Index Levels
outside = ['G1','G1','G1','G2','G2','G2']
inside = [1,2,3,1,2,3]
hier_index = list(zip(outside,inside))
hier_index = pd.MultiIndex.from_tuples(hier_index)
# print the index
hier_index
Output:
MultiIndex(levels=[['G1', 'G2'], [1, 2, 3]],
           labels=[[0, 0, 0, 1, 1, 1], [0, 1, 2, 0, 1, 2]])
df = pd.DataFrame(np.random.randn(6,2),index=hier_index,columns=['A','B'])
df
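Once such a DataFrame exists, rows can be pulled out one level at a time with loc. The sketch below rebuilds the Multi-Indexed frame from above; the names g1 and one_row are illustrative, and no random seed is set, so the printed numbers will differ from run to run.

```python
import numpy as np
import pandas as pd

# Rebuild the Multi-Indexed DataFrame from the lecture
outside = ['G1', 'G1', 'G1', 'G2', 'G2', 'G2']
inside = [1, 2, 3, 1, 2, 3]
hier_index = pd.MultiIndex.from_tuples(list(zip(outside, inside)))
df = pd.DataFrame(np.random.randn(6, 2), index=hier_index, columns=['A', 'B'])

g1 = df.loc['G1']            # all rows under the outer label 'G1'
one_row = df.loc[('G1', 1)]  # a single row, addressed by a tuple of labels

print(g1)
print(one_row)
```

Selecting on the outer level returns a smaller DataFrame indexed by the inner level, while a full tuple of labels returns one row as a Series.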
Great Job!
We have touched on almost all the main points of the Pandas DataFrame. If you have any questions regarding this topic, please contact us through the comment section, or you can also mail us.
Thanks 🙂