Description
Am I using the newest version of the library?
- I have made sure that I'm using the latest version of the library.
Is there an existing issue for this?
- I have searched the existing issues
Current Behavior
I have a Java application that uses the Spark Excel library to read an Excel file into a Dataset<Row>. With the following configuration:
String filePath = "file:///Users/myuser/example-data.xlsx";
Dataset<Row> dataset = spark.read()
    .format("com.crealytics.spark.excel")
    .option("header", "true")
    .option("inferSchema", "true")
    .option("dataAddress", "'ExampleData'!A2:D7")
    .load(filePath);
This works, and my Dataset<Row> is populated without any issues. But as soon as I tell it to read all rows in columns A through D, without row numbers, it returns an empty Dataset<Row>:
// dataset will be empty
.option("dataAddress", "'ExampleData'!A:D")
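To make the difference between the working and the failing address explicit, here is a minimal sketch that classifies a dataAddress string by its shape. `DataAddressShape` and `classify` are hypothetical names of my own, not part of spark-excel, and the regular expressions only cover plain A1-style addresses with an optional sheet prefix. Whether the library is supposed to accept the column-only shape is exactly what I am asking.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of spark-excel: reports the shape of a dataAddress string.
final class DataAddressShape {
    // Optional sheet prefix, either quoted ('ExampleData'!) or unquoted (Sheet1!).
    private static final String SHEET = "(?:(?:'[^']+'|[^'!]+)!)?";
    // Both corners carry a row number, e.g. A2:D7.
    private static final Pattern BOUNDED =
            Pattern.compile(SHEET + "[A-Za-z]+[0-9]+:[A-Za-z]+[0-9]+");
    // A single cell, e.g. A2.
    private static final Pattern START_CELL =
            Pattern.compile(SHEET + "[A-Za-z]+[0-9]+");
    // Columns only, no row numbers, e.g. A:D.
    private static final Pattern COLUMN_ONLY =
            Pattern.compile(SHEET + "[A-Za-z]+:[A-Za-z]+");

    static String classify(String address) {
        if (BOUNDED.matcher(address).matches()) return "BOUNDED_RANGE";
        if (START_CELL.matcher(address).matches()) return "START_CELL";
        if (COLUMN_ONLY.matcher(address).matches()) return "COLUMN_ONLY";
        return "OTHER";
    }

    public static void main(String[] args) {
        System.out.println(classify("'ExampleData'!A2:D7")); // BOUNDED_RANGE
        System.out.println(classify("'ExampleData'!A:D"));   // COLUMN_ONLY
        System.out.println(classify("A:D"));                 // COLUMN_ONLY
    }
}
```

The only address that works for me falls in the bounded shape; every address that yields an empty dataset falls in the column-only shape.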
This also happens if I set the sheetName and dataAddress separately:
// dataset will be empty
.option("sheetName", "ExampleData")
.option("dataAddress", "A:D")
And it also happens when, instead of providing the sheetName, I provide a sheetIndex:
// dataset will be empty; and I have experimented by setting it to 0 as well
// in case it is a 0-based index
.option("sheetIndex", 1)
.option("dataAddress", "A:D")
My question: is this expected behavior of the Spark Excel library, or is it a bug I have discovered, or am I not using the Options API correctly here?
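For anyone who wants to reproduce this, the four configurations above can be collected as plain option maps. This is only a sketch: `ReproCases` is a hypothetical name, and the Spark call itself is left out so the block compiles without Spark on the classpath. The sheetIndex value is written as a string because Spark stores reader options as strings anyway.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical holder for the option sets described in this issue.
final class ReproCases {
    // Builds one option map: the shared header/inferSchema settings plus the given key/value pairs.
    private static Map<String, String> withBase(String... keyValues) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("header", "true");
        options.put("inferSchema", "true");
        for (int i = 0; i < keyValues.length; i += 2) {
            options.put(keyValues[i], keyValues[i + 1]);
        }
        return options;
    }

    // Index 0 is the configuration that works; indexes 1 to 3 return an empty dataset for me.
    static List<Map<String, String>> all() {
        List<Map<String, String>> cases = new ArrayList<>();
        cases.add(withBase("dataAddress", "'ExampleData'!A2:D7"));
        cases.add(withBase("dataAddress", "'ExampleData'!A:D"));
        cases.add(withBase("sheetName", "ExampleData", "dataAddress", "A:D"));
        cases.add(withBase("sheetIndex", "1", "dataAddress", "A:D"));
        return cases;
    }

    public static void main(String[] args) {
        for (Map<String, String> options : all()) {
            System.out.println(options);
        }
    }
}
```

Each map can then be passed to `spark.read().format("com.crealytics.spark.excel").options(map).load(filePath)` and the resulting row counts compared.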
Expected Behavior
As explained above, I would have expected all four option configurations to work, but only the first one (the bounded range A2:D7) does.
Steps To Reproduce
Code is provided above. I am pulling in the following Gradle libraries:
implementation("org.apache.spark:spark-core_2.12:3.5.3")
implementation("org.apache.spark:spark-sql_2.12:3.5.3")
implementation("com.crealytics:spark-excel_2.12:3.5.1_0.20.4")
implementation("com.databricks:spark-xml_2.12:0.18.0")
I am using a Java application (not Scala).
Environment
- Spark version: `3.5.3` (Scala 2.12)
- Spark-Excel version: `3.5.1_0.20.4` (Scala 2.12)
- OS: Mac Sequoia 15.3
- Cluster environment:
Anything else?
No response