How do I access each dataset’s dataset_fpath attribute?

@facepalm Perhaps the easiest way to do this is in the after_catalog_created hook or in the register_catalog hook.

In both of these hooks, you get access to the catalog object. If you’re just trying to create directories, you can step through all the catalog dataset objects, check if they have the _filepath attribute, and then make a directory inside of that hook.
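A minimal sketch of that loop, assuming each dataset object may or may not expose a `_filepath` attribute. The function name `ensure_dataset_dirs` is hypothetical, and how you pull the dataset objects out of the catalog depends on your Kedro version, so that part is left out; the function just takes any iterable of dataset-like objects and could be called from inside either hook.

```python
from pathlib import Path
from typing import Iterable, List


def ensure_dataset_dirs(datasets: Iterable[object]) -> List[Path]:
    """Create the parent directory of every dataset that exposes `_filepath`.

    Hypothetical helper: `datasets` is whatever iterable of dataset objects
    you obtain from the catalog inside the hook.
    """
    created = []
    for dataset in datasets:
        filepath = getattr(dataset, "_filepath", None)
        if filepath is None:
            continue  # e.g. in-memory datasets have no file on disk
        # str() so this works whether _filepath is a str or a path object
        parent = Path(str(filepath)).parent
        parent.mkdir(parents=True, exist_ok=True)
        created.append(parent)
    return created
```

Using `exist_ok=True` keeps the call idempotent, so it is safe to run every time the catalog is created.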

If you only want to make directories when that particular dataset is called, then you will have to create a class that wraps the original dataset and overrides its save function to read self._filepath and create the directory. This is subtly different from hooks because, in hooks, you don’t get access to the object’s attributes.
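Here is a rough sketch of the wrapper idea, using composition rather than subclassing. `DirCreatingDataset` is a hypothetical name, and the only assumption about the wrapped object is that it has `save`/`load` methods and possibly a `_filepath` attribute; it does not inherit from any Kedro base class, which a real catalog entry would likely need.

```python
from pathlib import Path
from typing import Any


class DirCreatingDataset:
    """Hypothetical wrapper: ensures the target directory exists before saving."""

    def __init__(self, wrapped: Any) -> None:
        self._wrapped = wrapped

    def load(self) -> Any:
        # Loading is passed straight through to the wrapped dataset
        return self._wrapped.load()

    def save(self, data: Any) -> None:
        filepath = getattr(self._wrapped, "_filepath", None)
        if filepath is not None:
            # Create the parent directory only at the moment of saving
            Path(str(filepath)).parent.mkdir(parents=True, exist_ok=True)
        self._wrapped.save(data)
```

With this approach a directory only appears when the dataset is actually saved, instead of for every catalog entry up front.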


If it helps to see a concrete example, steel_toes is a library that reaches into your catalog and makes small tweaks to your filepath. I think your use case would be quite simple comparatively.

I like the idea of making sure that the directories exist, but why not include them in your repo and put .gitkeep files in any empty directory? One concern I would have: if a project had this hook, I could end up with a crazy directory tree that includes a teammate’s User folder. I see people use hard-coded paths far too often; __file__ seems to be lesser known.

For example :stuck_out_tongue_winking_eye: C:/Users/teammate/Dropbox/work_projects/sales/car_sales/car_sales/data/a_raw/sales/cars/
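For anyone who hasn't used the `__file__` trick, a small sketch: build paths from the location of the module itself instead of hard-coding a user folder. The helper name `path_relative_to` and the directory names in the usage note are illustrative only.

```python
from pathlib import Path


def path_relative_to(anchor_file: str, *parts: str) -> Path:
    """Build a path starting from the directory that contains `anchor_file`."""
    return Path(anchor_file).resolve().parent.joinpath(*parts)
```

Inside a module you would call it as `path_relative_to(__file__, "data", "a_raw")`, which resolves correctly on any teammate's machine regardless of where the repo is checked out.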


@waylonwalker luckily, we always use relative paths in our catalog :slight_smile:

If anyone is interested in this hook implementation, I posted the snippet here.

Nice! Now I recognize you from Twitter.