Using pandas: combining/merging 2 different Excel files/sheets

Asked Aug 20, 2014
8 upvotes
Updated April 16, 2026

Question posted 2014 · +6 upvotes

I am trying to combine two different Excel files (thanks to the post "Import multiple excel files into python pandas and concatenate them into one dataframe").

What I have worked out so far is:

import os
import pandas as pd

df = pd.DataFrame()

# Raw strings keep the backslashes in the Windows paths literal
for f in [r'c:\file1.xls', r'c:\file2.xls']:
    data = pd.read_excel(f, 'Sheet1')
    df = df.append(data)

df.to_excel(r"c:\all.xls")

Here is what they look like:

[image]

However I want to:

  1. Exclude the last two rows of each file (i.e. row4 and row5 in File1.xls; row7 and row8 in File2.xls).
  2. Add a column (or overwrite Column A) to indicate which file the data came from.

For example:

[image]

Is it possible? Thanks.

Accepted answer +8 upvotes

For num. 1, you can pass skip_footer to read_excel (the parameter is spelled skipfooter in current pandas versions); or, alternatively, do

data = data.iloc[:-2]

once you have read the data.
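As a minimal sketch of both options, the snippet below trims a small in-memory frame with iloc and wraps the skipfooter route in a helper. The sample values and the helper name read_without_footer are made up for illustration; only read_excel, skipfooter and iloc come from pandas itself.

```python
import pandas as pd

# Stand-in for one sheet (made-up values): three data rows, two footer rows.
data = pd.DataFrame({
    "A": ["row1", "row2", "row3", "row4", "row5"],
    "B": [10, 20, 30, 40, 50],
})

# Option 1: read everything, then drop the last two rows.
trimmed = data.iloc[:-2]


def read_without_footer(path, sheet_name="Sheet1", footer_rows=2):
    """Hypothetical helper: option 2, let read_excel skip the footer rows."""
    return pd.read_excel(path, sheet_name=sheet_name, skipfooter=footer_rows)


print(trimmed)
```

Note that iloc[:-n] only works for n >= 1; iloc[:-0] would return an empty frame, so a file without footer rows needs a separate branch.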

For num. 2, you may do:

from os.path import basename
data.index = [basename(f)] * len(data)
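The lines above put the file name in the index. Since the question asks for a column, here is one possible variant that inserts the name as the first column instead. The column name "source" and the helper add_source_column are assumptions for illustration, and PureWindowsPath is swapped in for basename so that the Windows-style path is split the same way on any operating system.

```python
import pandas as pd
from pathlib import PureWindowsPath


def add_source_column(data, path, column="source"):
    """Return a copy of `data` with the file name as the first column."""
    out = data.copy()
    # A scalar value is broadcast to every row.
    out.insert(0, column, PureWindowsPath(path).name)
    return out


# Made-up sample standing in for a sheet that has already been read.
sample = pd.DataFrame({"B": [10, 20, 30]})
tagged = add_source_column(sample, r"c:\file1.xls")
print(tagged)
```

To overwrite an existing column instead, assigning to it directly (for example data["A"] = name) should do the same job.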

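Also, it would probably be better to put all the DataFrames in a list and concatenate them once at the end (DataFrame.append was removed in pandas 2.0); something like:

import os
import pandas as pd

frames = []
for f in [r'c:\file1.xls', r'c:\file2.xls']:
    # Read the sheet and drop the last two (footer) rows
    data = pd.read_excel(f, 'Sheet1').iloc[:-2]
    # Label every row with the name of the file it came from
    data.index = [os.path.basename(f)] * len(data)
    frames.append(data)

df = pd.concat(frames)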
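Another way to label the rows, sketched here on in-memory frames, is to let pd.concat do it: passing a mapping of name to DataFrame makes the names an outer index level, which reset_index can then turn into a column. The frames, the column "B" and the level names "source" and "row" are made up for illustration; in practice each frame would come from read_excel as in the loop above.

```python
import pandas as pd

# Made-up stand-ins for the two sheets: 5 and 8 rows, last two are footers.
file1 = pd.DataFrame({"B": range(5)})
file2 = pd.DataFrame({"B": range(8)})

# Trim the footers, keyed by the file each frame came from.
frames = {
    "file1.xls": file1.iloc[:-2],
    "file2.xls": file2.iloc[:-2],
}

# The dict keys become the outer index level, named "source".
combined = pd.concat(frames, names=["source", "row"])

# Move the file name out of the index and into an ordinary column.
flat = combined.reset_index(level="source")
print(flat)
```

The result can then be written out with flat.to_excel(...) as in the question; whether the index or a column is the better place for the file name depends on how the combined sheet will be used.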
