Spark error "Too large frame" when loading SparkSession

I have been periodically running kedro pipelines that I started from the pyspark-iris starter. Everything worked fine until recently, when I suddenly started getting this Spark error, originating from py4j: java.lang.IllegalArgumentException: Too large frame: 5785721462337832960. It happens even when I run only the default pipeline with iris.csv.

When I create a new kedro project (0.17.3) and try to run the example pipeline with iris.csv everything works fine, so I don’t think this has to do with my environment. Based on my research some people get this error when their pyspark version doesn’t match the version of their spark cluster, but I am running this all locally for now.
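Since a version mismatch is the cause other people report, one quick check is to print the installed package versions in both the working and the failing project and compare them. This is only a minimal sketch, assuming Python 3.8+; `installed_versions` is a hypothetical helper name, not part of kedro or pyspark.

```python
from importlib import metadata


def installed_versions(packages=("pyspark", "py4j", "kedro")):
    """Return {package: version or None} for the active Python environment."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this environment
    return versions


if __name__ == "__main__":
    # Run this once in each project's environment and compare the output.
    for name, version in installed_versions().items():
        print(f"{name}: {version}")
```

If the two environments report different pyspark or py4j versions, that difference would be the first thing to rule out.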

I thought I might genuinely have a spark partition that is too large, but how could I even tell which catalog item is causing this? I tried temporarily swapping out my catalog for an empty one that only includes iris.csv and I still get the error.
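One way to sanity-check the "partition too large" theory is to look at the reported size itself. The sketch below assumes (this is an assumption, not something confirmed for this setup) that the number is just the first 8 bytes of whatever arrived on the connection, read as a big-endian 64-bit length. `frame_size_to_bytes` is a hypothetical helper name.

```python
FRAME_SIZE = 5785721462337832960  # value copied from the error message


def frame_size_to_bytes(value: int) -> bytes:
    """Re-encode the reported frame size as 8 big-endian bytes."""
    return value.to_bytes(8, byteorder="big")


raw = frame_size_to_bytes(FRAME_SIZE)
print(raw)                        # b'PK\x03\x04\x14\x00\x08\x00'
print(raw[:4] == b"PK\x03\x04")   # True: ZIP local file header signature
```

The first four bytes match the signature that starts .zip and .jar files, so the number may not be a real data size at all. That would point toward some archive-like content being read where a frame header was expected, rather than toward a catalog entry with an oversized partition, though this is only a hint and not a diagnosis.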

kedro, version 0.17.3

Custom plugins? No

Thank you for your help on this!!

I have the same problem: “Too large frame” error when just loading SparkSession! Did you solve it?