Indexing in pandas means selecting rows and columns of data from a DataFrame.
You can select all rows and some of the columns, some of the rows and all of the columns, or some of the rows and some of the columns.
Let’s create a DataFrame with a few rows and columns and run some examples to learn how to select rows by index.
import pandas as pd
import numpy as np

games = {
    'Game': ["Wii Sports", "Mario Kart Wii", "Tetris", "Duck Hunt", "The Sims", "Grand Theft Auto IV", "Gran Turismo 2"],
    'Platform': ["Wii", "Wii", "Gameboy", "NES", "PC", "XBox360", "Playstation"],
    'Publisher': ["Nintendo", "Nintendo", "Nintendo", "Nintendo", "Electronic Arts", "Take-Two Interactive", 'Sony'],
    'Year': [2006, 2008, 1989, 1984, 2000, 2008, 1999]
}
index_labels = ['GAME1', 'GAME2', 'GAME3', 'GAME4', 'GAME5', 'GAME6', 'GAME7']
df = pd.DataFrame(games, index=index_labels)
print(df)
This displays the following:
                      Game     Platform             Publisher  Year
GAME1           Wii Sports          Wii              Nintendo  2006
GAME2       Mario Kart Wii          Wii              Nintendo  2008
GAME3               Tetris      Gameboy              Nintendo  1989
GAME4            Duck Hunt          NES              Nintendo  1984
GAME5             The Sims           PC       Electronic Arts  2000
GAME6  Grand Theft Auto IV      XBox360  Take-Two Interactive  2008
GAME7       Gran Turismo 2  Playstation                  Sony  1999
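The three kinds of selection mentioned in the introduction (all rows with some columns, some rows with all columns, and some of each) can be sketched with a two-part indexer of the form [rows, columns]. This is a minimal sketch on a shortened three-row copy of the DataFrame; the variable names are illustrative only.

```python
import pandas as pd

# Shortened copy of the games DataFrame, for illustration only
df = pd.DataFrame(
    {
        'Game': ["Wii Sports", "Mario Kart Wii", "Tetris"],
        'Platform': ["Wii", "Wii", "Gameboy"],
        'Year': [2006, 2008, 1989],
    },
    index=['GAME1', 'GAME2', 'GAME3'],
)

# All rows, selected columns (by label)
all_rows_some_cols = df.loc[:, ['Game', 'Year']]

# Selected rows (by position), all columns
some_rows_all_cols = df.iloc[0:2, :]

# Selected rows and selected columns (both by label)
some_rows_some_cols = df.loc[['GAME1', 'GAME3'], ['Game', 'Platform']]

print(all_rows_some_cols)
print(some_rows_all_cols)
print(some_rows_some_cols)
```

The rest of this article focuses on the row part of the indexer.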
Here are some quick examples:
# Select Rows by Integer Index
print(df.iloc[2])                          # Select Row by Index
print(df.iloc[[1,2,4]])                    # Select Rows by Index List
print(df.iloc[1:3])                        # Select Rows by Integer Index Range
print(df.iloc[:1])                         # Select First Row
print(df.iloc[:2])                         # Select First 2 Rows
print(df.iloc[-1:])                        # Select Last Row
print(df.iloc[-2:])                        # Select Last 2 Rows
print(df.iloc[::2])                        # Select Alternate Rows

# Select Rows by Index Labels
print(df.loc['GAME1'])                     # Select Row by Index Label
print(df.loc[['GAME1','GAME2','GAME4']])   # Select Rows by Index Label List
print(df.loc['GAME2':'GAME4'])             # Select Rows by Label Index Range
print(df.loc['GAME1':'GAME7':2])           # Select Alternate Rows within Index Labels
Select a Row by Integer Index
You can select a single row from a pandas DataFrame by its integer position using df.iloc[n].
Replace n with the zero-based position of the row you want to select.
# Select Row by Integer Index
print(df.iloc[2])
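One detail worth knowing about single-row selection: passing a plain integer returns the row as a Series, while passing a one-element list returns a one-row DataFrame. The sketch below illustrates this on a shortened copy of the DataFrame; the variable names are illustrative only.

```python
import pandas as pd

# Shortened copy of the games DataFrame, for illustration only
df = pd.DataFrame(
    {'Game': ["Wii Sports", "Mario Kart Wii", "Tetris"],
     'Year': [2006, 2008, 1989]},
    index=['GAME1', 'GAME2', 'GAME3'],
)

row_as_series = df.iloc[2]    # integer position -> Series
row_as_frame = df.iloc[[2]]   # one-element list -> one-row DataFrame

print(type(row_as_series).__name__)   # Series
print(type(row_as_frame).__name__)    # DataFrame
print(row_as_series['Game'])          # Tetris
```

In the Series form, the column names become the index, so individual values can be read by column name.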
Get Multiple Rows by Index List
To get multiple rows from a DataFrame, specify their integer positions as a list.
For example, df.iloc[[1,2,4]] selects the 2nd, 3rd, and 5th rows, because positions start from zero.
# Select Rows by Index List
print(df.iloc[[1,2,4]])
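As a small illustration of how a position list behaves, the sketch below shows that rows come back in the order the positions are listed, and that a position outside the DataFrame raises an IndexError. It uses a shortened copy of the DataFrame, and the variable names are illustrative only.

```python
import pandas as pd

# Shortened copy of the games DataFrame, for illustration only
df = pd.DataFrame(
    {'Game': ["Wii Sports", "Mario Kart Wii", "Tetris"]},
    index=['GAME1', 'GAME2', 'GAME3'],
)

# Rows are returned in the order given in the list
reordered = df.iloc[[2, 0]]
print(reordered)

# Position 5 does not exist in a 3-row DataFrame
out_of_bounds_error = None
try:
    df.iloc[[0, 5]]
except IndexError as exc:
    out_of_bounds_error = exc
    print("IndexError:", exc)
```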
Get DataFrame Rows by Index Range
When you want to select DataFrame rows by a range of positions, you can provide start and stop indexes as a slice.
If you omit the start index, iloc[] selects from the first row.
If you omit the stop index, iloc[] selects all rows from the start index to the end.
If you provide both, iloc[] selects the rows from the start index up to, but not including, the stop index.
# Select Rows by Integer Index Range
print(df.iloc[1:3])

# Select First Row by Index
print(df.iloc[:1])

# Select First 2 Rows
print(df.iloc[:2])

# Select Last Row
print(df.iloc[-1:])

# Select Last 2 Rows
print(df.iloc[-2:])

# Selects alternate rows
print(df.iloc[::2])
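The slice rules above follow normal Python slicing. The sketch below, on a shortened four-row copy of the DataFrame, checks that the stop position is excluded and that a slice running past the end simply returns the rows that exist rather than raising an error. Variable names are illustrative only.

```python
import pandas as pd

# Shortened copy of the games DataFrame, for illustration only
df = pd.DataFrame(
    {'Year': [2006, 2008, 1989, 1984]},
    index=['GAME1', 'GAME2', 'GAME3', 'GAME4'],
)

# Positions 1 and 2 are returned; position 3 (the stop) is excluded
middle = df.iloc[1:3]
print(list(middle.index))     # ['GAME2', 'GAME3']

# A stop beyond the last row is allowed in a slice
past_end = df.iloc[2:100]
print(list(past_end.index))   # ['GAME3', 'GAME4']
```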
Get Rows by Label
If the DataFrame has index labels, you can use the label names to select rows. For example, df.loc['GAME1'] returns the row with the label 'GAME1'.
print(df.loc['GAME1'])
Get Multiple Rows by Label List
If you have a list of row labels, you can pass it to loc[] to select multiple rows from a pandas DataFrame.
# Select Rows by Index Label List
print(df.loc[['GAME1','GAME2','GAME4']])
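One caveat when selecting by a label list: in recent pandas versions, loc[] raises a KeyError if any label in the list is missing from the index. The sketch below shows one possible way to guard against that by filtering the list first. The label 'GAME9' is made up for illustration, and the variable names are not from the original example.

```python
import pandas as pd

# Shortened copy of the games DataFrame, for illustration only
df = pd.DataFrame(
    {'Game': ["Wii Sports", "Mario Kart Wii", "Tetris"]},
    index=['GAME1', 'GAME2', 'GAME3'],
)

wanted = ['GAME1', 'GAME9']   # 'GAME9' is a made-up label that does not exist

missing_label_error = None
try:
    df.loc[wanted]
except KeyError as exc:
    missing_label_error = exc
    print("KeyError:", exc)

# Keep only the labels that are actually in the index before selecting
present = [label for label in wanted if label in df.index]
print(df.loc[present])
```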
Get Rows Between Two Labels
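You can also select rows between two index labels by slicing with loc[]. Unlike iloc[], a loc[] slice includes both the start label and the stop label. An optional third value sets the step.
print(df.loc['GAME2':'GAME4'])
print(df.loc['GAME1':'GAME7':2])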
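To make the difference between label slices and position slices concrete, here is a minimal sketch on a shortened five-row copy of the DataFrame. It compares a loc[] slice with the iloc[] slice that starts at the same row; the variable names are illustrative only.

```python
import pandas as pd

# Shortened copy of the games DataFrame, for illustration only
df = pd.DataFrame(
    {'Year': [2006, 2008, 1989, 1984, 2000]},
    index=['GAME1', 'GAME2', 'GAME3', 'GAME4', 'GAME5'],
)

# Label slice: both endpoints are included
by_label = df.loc['GAME2':'GAME4']
print(list(by_label.index))      # ['GAME2', 'GAME3', 'GAME4']

# Position slice: the stop position is excluded
by_position = df.iloc[1:3]
print(list(by_position.index))   # ['GAME2', 'GAME3']

# Label slice with a step of 2
every_other = df.loc['GAME1':'GAME5':2]
print(list(every_other.index))   # ['GAME1', 'GAME3', 'GAME5']
```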
Example
import pandas as pd
import numpy as np

games = {
    'Game': ["Wii Sports", "Mario Kart Wii", "Tetris", "Duck Hunt", "The Sims", "Grand Theft Auto IV", "Gran Turismo 2"],
    'Platform': ["Wii", "Wii", "Gameboy", "NES", "PC", "XBox360", "Playstation"],
    'Publisher': ["Nintendo", "Nintendo", "Nintendo", "Nintendo", "Electronic Arts", "Take-Two Interactive", 'Sony'],
    'Year': [2006, 2008, 1989, 1984, 2000, 2008, 1999]
}
index_labels = ['GAME1', 'GAME2', 'GAME3', 'GAME4', 'GAME5', 'GAME6', 'GAME7']
df = pd.DataFrame(games, index=index_labels)
print(df)

print(df.iloc[[1,2,4]])                    # Select Rows by Index List
print(df.iloc[1:3])                        # Select Rows by Integer Index Range
print(df.iloc[:1])                         # Select First Row
print(df.iloc[:2])                         # Select First 2 Rows
print(df.iloc[-1:])                        # Select Last Row
print(df.iloc[-2:])                        # Select Last 2 Rows
print(df.iloc[::2])                        # Select Alternate Rows

print(df.loc['GAME1'])                     # Select Row by Index Label
print(df.loc[['GAME1','GAME2','GAME4']])   # Select Rows by Index Label List
print(df.loc['GAME2':'GAME4'])             # Select Rows by Label Index Range
print(df.loc['GAME1':'GAME7':2])           # Select Alternate Rows within Index Labels