Examples: Handle Empty Column Names - nightscape/spark-excel GitHub Wiki
Purpose: An simple example with excel file, that data does not started from the first row, and some column names are missing
Credit: to @datadu-de for sharing this example
Missing column names excel file
You can follow the linked issue #366 or access the sample file from: df_empty_headers_not_working.xlsx
For this excel file, there are some columns without header and needed data is not from the first row, as we can see in this screenshot:
To load this excel file, just need to set the dataAddress from cell A8, Spark Excel will generate unique names for missing column headers
df = spark.read.format("excel") \
.option("header", True) \
.option("inferSchema", False) \
.option("dataAddress", "A8") \
.load(f"/<change to your local path>/df_empty_headers_not_working.xlsx")
Now, we can check our laded data frame and its schema
df.show(10)
df.printSchema()
df.count()