Migrating a batch process to a streaming service (not PySpark). Data size increased dramatically over a one-year period, and changing a manual process to an automated one swamped the current automated process.