I’m just getting started with Kedro, so please excuse the question if it’s been solved or if this is the wrong tool.
I work at a company that stores most of its data in S3. We have quite a few tables in AWS Athena that read from those S3 buckets, and I’m wondering whether I can use Kedro to query Athena directly.
My typical workflow looks like this:
- Write a template query to pull sample data sets from Athena.
- Load the template query in a Python script, query Athena using PyAthena, and save the output to disk.
- Take the output into other notebooks for modeling/visuals.
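For concreteness, the first two steps can be sketched roughly as below. This is only an illustration, not my exact script: the file paths, the staging bucket, the region, and the `sample_date` parameter are made-up placeholders, and the helper names (`render_query`, `fetch_to_csv`, `athena_connection`, `main`) are hypothetical. It assumes the template uses `$name` placeholders and that the result is small enough to fetch in one go.

```python
import csv
from pathlib import Path
from string import Template


def render_query(template_text: str, **params: str) -> str:
    """Fill the $placeholders of a template query with concrete values."""
    # Plain string substitution: fine for trusted, hand-written parameters,
    # not safe for untrusted input.
    return Template(template_text).substitute(**params)


def fetch_to_csv(conn, sql: str, out_path: Path) -> int:
    """Run `sql` on a DB-API connection and write the result to a CSV file.

    Returns the number of data rows written (the header is not counted).
    """
    cursor = conn.cursor()
    cursor.execute(sql)
    header = [col[0] for col in cursor.description]
    rows = cursor.fetchall()
    with out_path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    return len(rows)


def athena_connection():
    """Open a PyAthena connection (bucket and region are placeholders)."""
    # Imported here so the helpers above can be used without PyAthena installed.
    from pyathena import connect

    return connect(
        s3_staging_dir="s3://my-athena-results/staging/",  # hypothetical bucket
        region_name="us-east-1",  # hypothetical region
    )


def main() -> None:
    # Hypothetical paths for the template query and the output file.
    template_text = Path("queries/sample.sql").read_text()
    sql = render_query(template_text, sample_date="2020-01-01")
    n_rows = fetch_to_csv(athena_connection(), sql, Path("data/sample.csv"))
    print(f"wrote {n_rows} rows")
```

Calling `main()` runs the whole thing against Athena. `fetch_to_csv` only relies on the generic DB-API cursor interface, so it can be tried locally with any other DB-API connection.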
I understand Kedro can help organize the process once the data is on disk, but I’d love to have a way to handle the querying process as well. So, can Kedro query Athena?