How to store pickle files in kedro using the catalog?

In one of my pipelines, I build a GLM and I want to save the model for later use. Before using Kedro, I would pickle the model and save it to a file. I can do the same thing in Kedro by pickling it within the node and specifying the file path there.

However, I don’t want to specify the file path in the node because I want to keep all the file paths and data storage information within the catalog.

I’ve tried to point to the catalog’s file path to save the pickle file but it doesn’t seem to work.

Does anyone have any ideas on how to do this? Is it possible to save a pickle file without specifying a file path?

I might be misunderstanding, but this should be as easy as using the pickle.PickleDataSet in your catalog. So if your catalog looks like this:

my_output:
  type: pickle.PickleDataSet
  filepath: data/06_models/my_model.pkl

and your node looks like this:

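from kedro.pipeline import node

def my_func(dataset):
    # Return any picklable Python object, e.g. the fitted model
    return python_object_like_model

# Note the plural keyword names: `inputs` and `outputs`
node(
    my_func, inputs="dataset", outputs="my_output"
)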

then Kedro will save the output of my_func as a pickle file at the filepath you specified in your catalog, so the node itself never needs to know the path. You can look at the spaceflights tutorial in the documentation for an example of this.
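If it helps to see why no path is needed in the node, here is a rough standard-library sketch of the idea. It is not Kedro's actual implementation: SketchPickleDataset, the plain dict used as a "catalog", and the dummy model returned by my_func are all made up for illustration. The point is only that the object holding the filepath does the pickling, while the function just returns a Python object.

```python
import pickle
import tempfile
from pathlib import Path


class SketchPickleDataset:
    """Hypothetical stand-in for a pickle-backed catalog entry (not Kedro's class)."""

    def __init__(self, filepath):
        self._filepath = Path(filepath)

    def save(self, obj):
        # Create the parent folder if needed, then serialize the object.
        self._filepath.parent.mkdir(parents=True, exist_ok=True)
        with self._filepath.open("wb") as f:
            pickle.dump(obj, f)

    def load(self):
        # Read the pickled object back from the configured path.
        with self._filepath.open("rb") as f:
            return pickle.load(f)


def my_func(data):
    # Stand-in for model fitting: returns any picklable Python object.
    return {"coefficients": [0.5, 1.5], "n_rows": len(data)}


# The "catalog" owns the path; the node function never sees it.
root = Path(tempfile.mkdtemp())
catalog = {"my_output": SketchPickleDataset(root / "data/06_models/my_model.pkl")}

model = my_func([1, 2, 3])
catalog["my_output"].save(model)       # roughly what happens after the node runs
restored = catalog["my_output"].load() # roughly what a later node would receive
```

In a real project you would not write any of this yourself; the catalog entry above plays the role of SketchPickleDataset, and any later node that lists my_output among its inputs receives the unpickled object.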
