Question posted 2014 · +6 upvotes
I am trying to combine 2 different Excel files (thanks to the post "Import multiple excel files into python pandas and concatenate them into one dataframe").
What I have worked out so far is:
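import os
import pandas as pd
df = pd.DataFrame()
# Raw strings stop \f and \a in the paths from being read as escape sequences
for f in [r'c:\file1.xls', r'c:\file2.xls']:
    data = pd.read_excel(f, 'Sheet1')
    # DataFrame.append was removed in pandas 2.0; pd.concat does the same job
    df = pd.concat([df, data])
df.to_excel(r"c:\all.xls")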
Here is what they look like.

However, I want to:
- Exclude the last two rows of each file (i.e. row4 and row5 in File1.xls; row7 and row8 in File2.xls).
- Add a column (or overwrite Column A) to indicate which file the data came from.
For example:

Is it possible? Thanks.
Accepted answer +8 upvotes
For point 1, you can pass skip_footer to pd.read_excel (spelled skipfooter in newer pandas versions); alternatively, do
data = data.iloc[:-2]
once you have read the data.
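To see what the slice does without needing an Excel file, here is a minimal sketch on a made-up frame. The column names and values are purely illustrative stand-ins for whatever read_excel returns; only iloc[:-2] comes from the answer.

```python
import pandas as pd

# Hypothetical stand-in for one sheet: three data rows plus two footer rows.
data = pd.DataFrame({
    "A": ["x", "y", "z", "Total", "Notes"],
    "B": [1, 2, 3, 6, 0],
})

# Keep every row except the last two.
trimmed = data.iloc[:-2]
print(trimmed)
```

The slice is positional, so it works regardless of what the index labels are.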
For point 2, you can do:
from os.path import basename
data.index = [basename(f)] * len(data)
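The lines above put the file name in the index. If a regular column is preferred, as the question asks, one possible variation is sketched below. The column name "source", the forward-slash path, and the sample frame are assumptions made for illustration.

```python
from os.path import basename

import pandas as pd

f = "c:/file1.xls"  # hypothetical path, written with forward slashes
data = pd.DataFrame({"A": ["x", "y"], "B": [1, 2]})  # stand-in for read_excel output

# Assigning a scalar broadcasts it to every row of the new column.
data["source"] = basename(f)
print(data)
```

To put the new column first instead of last, data.insert(0, "source", basename(f)) should work in place of the assignment.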
Also, it would perhaps be better to put all the data frames in a list and then concat them at the end; something like:
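import os
import pandas as pd
df = []
for f in [r'c:\file1.xls', r'c:\file2.xls']:
    # drop the two footer rows right after reading
    data = pd.read_excel(f, 'Sheet1').iloc[:-2]
    # label every row with the name of its source file
    data.index = [os.path.basename(f)] * len(data)
    df.append(data)
df = pd.concat(df)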
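Putting both points together, a rough sketch of the whole flow might look like the following. The helper name combine, its footer_rows parameter, the "source" column, and the in-memory sample frames are all hypothetical; the frames stand in for the pd.read_excel(f, 'Sheet1') results so the example runs without any Excel files.

```python
from os.path import basename

import pandas as pd


def combine(sheets, footer_rows=2):
    """Trim footer rows from each frame, tag it with its file name, and stack them."""
    frames = []
    for path, data in sheets.items():
        # Drop the trailing rows; skip the slice entirely when there is nothing to drop.
        trimmed = data.iloc[:-footer_rows] if footer_rows else data
        # assign returns a new frame, so the input frames are left untouched.
        frames.append(trimmed.assign(source=basename(path)))
    # A single concat at the end avoids re-copying the data on every iteration.
    return pd.concat(frames, ignore_index=True)


# Hypothetical stand-ins for the two sheets, each ending in two footer rows.
sheets = {
    "c:/file1.xls": pd.DataFrame({"A": ["a1", "a2", "a3", "foot", "foot"]}),
    "c:/file2.xls": pd.DataFrame({"A": ["b1", "b2", "foot", "foot"]}),
}
combined = combine(sheets)
print(combined)
```

With real files, the dictionary values would be the frames returned by read_excel, and combined.to_excel(...) would write the result out.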