What is the goal you are trying to achieve?
I am trying to read dataset A from a database, compare it with the latest stored data (dataset B), and only at the end of the pipeline save dataset A as the new dataset B.
What have you tried, in order to accomplish the goal?
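    from kedro.pipeline import Pipeline, node

    # get_diff and sync are my own functions, defined elsewhere in the project.

    def create_pipeline(**kwargs):
        # Copy the data from Redshift into a local CSV dataset.
        node_consumer = node(
            lambda x: x, ["data_redshift"], "data_csv", name="data_consumer"
        )
        # Compare the fresh data with the previously stored snapshot.
        node_get_diff = node(
            get_diff, ["data_csv", "csv_latest"], "data_to_sync", name="get_diff"
        )
        # Push the differences downstream.
        node_sync = node(sync, ["data_to_sync"], None, name="data_sync")
        # Overwrite the stored snapshot with the fresh data.
        node_update_latest = node(
            lambda x: x, ["data_csv"], "csv_latest", name="update_latest"
        )
        return Pipeline(
            [node_consumer, node_get_diff, node_sync, node_update_latest]
        )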
But when I try to run this, I receive a circular dependency error, because csv_latest is used as an input in node_get_diff and as an output in node_update_latest. I understand why that happens. But what is the correct way to do this with Kedro?
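One possible direction, as a rough sketch only and not a confirmed Kedro recipe: Kedro derives the run order from dataset names, so the snapshot could be read and written under two different names, and the update node could take an extra input produced by the sync node so that it runs last. The snippet below models this without Kedro, using only the standard library's graphlib, to check the resulting order. The names csv_latest_new and sync_done are hypothetical. It also assumes that two catalog entries can point at the same underlying file and that sync returns some value.

```python
from graphlib import TopologicalSorter

# Each entry models a node as: name -> (input datasets, output datasets).
# "sync_done" and "csv_latest_new" are hypothetical names, not from the
# original pipeline. "csv_latest_new" is assumed to be a second catalog
# entry that writes to the same file "csv_latest" reads from.
NODES = {
    "data_consumer": (["data_redshift"], ["data_csv"]),
    "get_diff": (["data_csv", "csv_latest"], ["data_to_sync"]),
    "data_sync": (["data_to_sync"], ["sync_done"]),
    "update_latest": (["data_csv", "sync_done"], ["csv_latest_new"]),
}


def run_order(nodes):
    """Return node names ordered so that producers precede consumers."""
    # Map every dataset to the node that produces it.
    producer = {
        out: name for name, (_, outs) in nodes.items() for out in outs
    }
    # A node depends on the producers of its inputs; datasets with no
    # producer (e.g. "data_redshift") come from outside the pipeline.
    graph = {
        name: {producer[i] for i in ins if i in producer}
        for name, (ins, _) in nodes.items()
    }
    # static_order() raises graphlib.CycleError if the graph has a cycle.
    return list(TopologicalSorter(graph).static_order())


order = run_order(NODES)
print(order)
```

In this model no dataset name is both read and written, and update_latest can only run after data_sync has produced its output, so the snapshot is overwritten at the very end.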
What version of Kedro are you using? (Use
Do you have any custom plugins?