Reference: Azure Data Engineering: Azure Blob Storage vs. Azure Data Lake Storage Gen2 | LinkedIn
Quick Notes from same document:
ADLS Gen2 truly is the result of converging the capabilities of two storage services, Azure Blob Storage and Azure Data Lake Storage Gen1. The result? You get the best of both worlds. File system semantics, directory and file-level security capabilities from ADLS Gen1 are combined with the low-cost, tiered storage, high availability/disaster recovery capabilities from Azure Blob Storage.
ADLS Gen2 was designed with big data analytics in mind and is a key component in modern data analytics, data science, and data warehousing architectures. A fundamental component of ADLS Gen2 is the addition of a hierarchical namespace to Blob storage. To explain what this term really means, think about the file explorer on your computer. You likely have created (or at least attempted to create) an organized folder structure. Blob storage only imitates folders by putting "/" characters in blob names; in an ADLS Gen2 account, directories are real objects, so you can create a true folder hierarchy.
Besides providing a familiar interface style for developers, the hierarchical namespace is preferred when working with big data analytics frameworks like Hive and Spark. Without real directories, applications must process potentially millions of individual blobs to accomplish directory-level tasks, whereas the hierarchical namespace handles these tasks by updating a single entry in the parent directory. Spark jobs, for example, often write output to a temporary location and rename that location at the end of the job. The rename takes significantly less time with a hierarchical namespace.
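The rename cost described above can be illustrated with a toy model. This is only a sketch using in-memory Python dictionaries, not the Azure SDK or the real storage engine; the function names `rename_flat` and `rename_hierarchical` and the `output/_temporary` path are made up for illustration.

```python
from typing import Dict


def rename_flat(blobs: Dict[str, bytes], src: str, dst: str) -> int:
    """Rename a 'directory' in a flat namespace.

    Folders exist only as prefixes of blob names, so every blob under
    the prefix has to be rewritten. Returns the number of keys touched.
    """
    src_prefix = src.rstrip("/") + "/"
    dst_prefix = dst.rstrip("/") + "/"
    touched = 0
    # Snapshot the matching keys first so the dict is not mutated mid-iteration.
    for key in [k for k in blobs if k.startswith(src_prefix)]:
        blobs[dst_prefix + key[len(src_prefix):]] = blobs.pop(key)
        touched += 1
    return touched


def rename_hierarchical(parent: Dict[str, dict], src: str, dst: str) -> int:
    """Rename a child directory by updating one entry in its parent.

    The children move with the directory object, so the cost does not
    depend on how many files it contains. Returns the entries touched.
    """
    parent[dst] = parent.pop(src)
    return 1


# Flat namespace: 1000 blobs whose names share a prefix.
flat = {f"output/_temporary/part-{i:05d}": b"" for i in range(1000)}
flat_ops = rename_flat(flat, "output/_temporary", "output/final")

# Hierarchical namespace: a nested directory tree with the same 1000 files.
tree = {"output": {"_temporary": {f"part-{i:05d}": b"" for i in range(1000)}}}
tree_ops = rename_hierarchical(tree["output"], "_temporary", "final")

print(flat_ops, tree_ops)  # 1000 1
```

In this model the flat rename grows with the number of files, while the hierarchical rename stays at one operation. That is the effect the Spark temporary-output example relies on.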
If you try searching for “Azure Data Lake Storage Gen2” in the Azure portal, you will not find what you’re looking for! ADLS Gen2 accounts are provisioned by selecting the “enable hierarchical namespace” option when creating an Azure Storage Account. An existing account without the option can later be upgraded to enable it, but once the hierarchical namespace is enabled it cannot be turned off.
Azure Data Lake Storage Gen2 is a superset of Azure Blob storage capabilities. The list below summarizes some of the key differences between ADLS Gen2 and Blob storage.
ADLS Gen2 supports POSIX-style access control lists (ACLs) on directories and files, allowing for more granular access control than Blob storage.
ADLS Gen2 introduces a hierarchical namespace. This is a true file system, unlike Blob Storage which has a flat namespace. This capability has a significant impact on performance, especially in big data analytics scenarios.
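ADLS Gen2 is an HDFS-compatible store. This means that Apache Hadoop services can use data stored in ADLS Gen2 directly, through the ABFS driver. Blob storage with a flat namespace can be reached from Hadoop only through the older WASB driver, which emulates directories rather than providing true file system semantics.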
If you are storing VHD files or have a workload that would not benefit from a file system hierarchy, then ADLS Gen2 may not be the right choice.
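To make the ACL item in the list above more concrete, here is a minimal sketch of how a POSIX-style ACL string in the short form ADLS Gen2 uses (for example `user::rwx,group::r-x,other::---`) breaks down into entries. It is plain Python and does not call any Azure API. The `AclEntry` type, the `parse_acl` helper, and the object ID in the example are hypothetical names introduced for illustration.

```python
from typing import List, NamedTuple


class AclEntry(NamedTuple):
    default: bool    # True for "default:" entries inherited by new children
    scope: str       # "user", "group", "mask", or "other"
    qualifier: str   # empty string means the owning user or owning group
    read: bool
    write: bool
    execute: bool


def parse_acl(acl: str) -> List[AclEntry]:
    """Split a comma-separated short-form ACL string into entries."""
    entries = []
    for raw in acl.split(","):
        parts = raw.strip().split(":")
        default = parts[0] == "default"
        if default:
            parts = parts[1:]
        # Each entry must be scope:qualifier:permissions with a 3-character mode.
        if len(parts) != 3 or len(parts[2]) != 3:
            raise ValueError(f"malformed ACL entry: {raw!r}")
        scope, qualifier, perms = parts
        entries.append(
            AclEntry(
                default=default,
                scope=scope,
                qualifier=qualifier,
                read=perms[0] == "r",
                write=perms[1] == "w",
                execute=perms[2] == "x",
            )
        )
    return entries


# Hypothetical object ID standing in for a named user.
example = (
    "user::rwx,group::r-x,other::---,"
    "user:00000000-0000-0000-0000-000000000001:r--"
)
entries = parse_acl(example)
for entry in entries:
    print(entry)
```

The last entry grants read-only access to one named user on top of the owner, group, and other permissions. Access at this level of detail is what the ACL model adds over container-level access in Blob storage.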