What Is ACID and How Does It Impact Data Lake Storage Environments? – Part 6
Why have I written this series? I have worked on six projects using Hadoop (Hortonworks/Cloudera) and AWS technologies. In those projects the data ingestion layer wrote to HDFS and S3 storage, which do not natively support features such as ACID transactions, incremental data loading, and handling of duplicate data. We therefore used HBase, DynamoDB, and some custom scripts to achieve that functionality, which involved a good amount of development effort and introduced some bugs. We also faced server performance issues with HBase (which is good for random read/write operations), for example around row keys and deletes, and these frameworks needed separate infrastructure, including maintenance. My suggestion is to try Apache Hudi or Delta Lake in your project, and to consider Apache Iceberg if there are heavy read operations.