[Azure Storage Account] PowerShell script for counting Blobs and calculating their size in Azure Storage - LuBu0505/My-Code GitHub Wiki

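Problem Description

This page introduces a PowerShell script that can be used in Microsoft Azure China.

It counts the Blobs in the containers of an Azure Storage Account and calculates their total size. The script can count and total by container, access tier (Hot/Cool/Archive), prefix, and soft-deleted/not soft-deleted state. The date filter uses each Blob's last-modified time as its reference.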

Sample output:

(Screenshot: image.png)

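Parameters

All values are required, although some may be left empty. See the descriptions below and the comments in the script.

  • $storageAccountName - Simply run the script and you will be prompted to select the storage account.
  • $containerName - Specify a container name, or leave empty (default) to list all containers.
  • $prefix - Specify a blob prefix to scan (not including the container name), or leave empty (default) to list all objects.
  • $deleted - Specify 'True' to list only soft-deleted objects, 'False' to list only objects that are not soft-deleted (active objects, the default), or 'All' to list both active and soft-deleted objects.
  • $blobType - Choose 'Base' to list only base Blobs (default), 'Snapshots' to list only snapshots, 'Versions' to list only versions, 'Versions+Snapshots' to list only versions and snapshots, or 'All Types' to list all objects (base Blobs, versions, and snapshots).
  • $accessTier - Choose 'Hot' to list only objects in the Hot tier, 'Cool' to list only objects in the Cool tier, 'Archive' to list only objects in the Archive tier, or 'All' to list objects in all tiers (Hot, Cool, and Archive).
  • $Year, $Month, $Day - Define a date so that only objects last modified on or before that date are listed. If at least one of the three values is empty, the current date is used.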
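
The date rule in the last bullet can be sketched as a small function. This is only an illustration of the logic in the script's "Date format" section; the function name `Resolve-MaxDate` and the sample values are assumptions, not part of the original script. One detail worth knowing: `Get-Date -Year ... -Month ... -Day ...` replaces only the parts supplied, so the cutoff keeps the current time of day.

```powershell
# Hypothetical helper mirroring the script's "Date format" section.
function Resolve-MaxDate {
    param(
        [string]$Year  = '',
        [string]$Month = '',
        [string]$Day   = ''
    )
    if ($Year -ne '' -and $Month -ne '' -and $Day -ne '') {
        # Only year, month and day are overridden; the time of day stays "now".
        return Get-Date -Year ([int]$Year) -Month ([int]$Month) -Day ([int]$Day)
    }
    # If any part is empty, fall back to the current date and time.
    return Get-Date
}

$cutoff   = Resolve-MaxDate -Year '2024' -Month '1' -Day '15'   # explicit date
$fallback = Resolve-MaxDate -Year '2024' -Month ''  -Day '15'   # month missing

Write-Host "Cutoff:   $($cutoff.ToString('yyyy-MM-dd'))"
Write-Host "Fallback: $($fallback.ToString('yyyy-MM-dd'))"
```

With the values above, `$cutoff` falls on 2024-01-15, while `$fallback` is the current date because `$Month` was left empty.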

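Notes

  1. This script does not support counting folders in ADLS Gen2 accounts.
  2. Simply run the script; it asks for AAD credentials and lets you select the storage account to list.
  3. By default (without changing any parameter), the script lists all base Blobs in all containers of the storage account, across all access tiers, whose last-modified date is earlier than or equal to the current date and time.
  4. All other options (above) should be defined in the script.
  5. The run may take hours or even days to complete, depending on the number of blobs, versions, and snapshots in the container or storage account.
  6. The contents of the $logs container are not counted (not supported).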
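
Note 5 follows from how the script reads a container: it requests blobs in pages (the script uses `-MaxCount 5000`) and keeps going while the last item of a page carries a continuation token. A minimal sketch of that loop, run against mock data rather than Azure, might look like this. `Get-MockPage` is a hypothetical stand-in for `Get-AzStorageBlob`, and the sizes are invented for illustration.

```powershell
# Hypothetical stand-in for Get-AzStorageBlob: returns one page of items and
# sets ContinuationToken on the last item when more pages remain.
function Get-MockPage {
    param([int[]]$AllSizes, [int]$PageSize, $Token)

    $start = if ($null -eq $Token) { 0 } else { [int]$Token }
    $end   = [Math]::Min($start + $PageSize, $AllSizes.Count) - 1
    if ($start -gt $end) { return }          # nothing left to list

    $page = @(foreach ($i in $start..$end) {
        [pscustomobject]@{ Length = $AllSizes[$i]; ContinuationToken = $null }
    })
    if ($end + 1 -lt $AllSizes.Count) {
        $page[-1].ContinuationToken = $end + 1   # index where the next page starts
    }
    return $page
}

$sizes    = 10, 20, 30, 40, 50   # five mock blobs, sizes in bytes
$count    = 0
$capacity = 0
$token    = $null

do {
    $page = @(Get-MockPage -AllSizes $sizes -PageSize 2 -Token $token)
    if ($page.Count -eq 0) { break }

    foreach ($blob in $page) {
        $count++
        $capacity += $blob.Length
    }
    # Same convention as the script: the token lives on the last item of the page.
    $token = $page[-1].ContinuationToken
} while ($null -ne $token)

Write-Host "Count: $count    Capacity: $capacity Bytes"
```

Because every page is a separate round trip to the service, the total run time grows with the number of blobs, versions, and snapshots.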

Permissions

To list Blobs using AAD, the Azure account (or AAD application) that runs the script needs the "Storage Blob Data Reader" role. Otherwise, the following permission error occurs: "The client '[email protected]' with object id 'xx-x-x-x-xxx' does not have authorization to perform action 'Microsoft.Storage/storageAccounts/listKeys/action' over scope '/subscriptions/xxxxx' or the scope is invalid." See the reference documentation:

Azure built-in roles for Blobs: https://learn.microsoft.com/zh-cn/azure/storage/blobs/authorize-access-azure-active-directory#azure-built-in-roles-for-blobs
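
As a rough sketch, the role could be granted at storage-account scope with `New-AzRoleAssignment` from the Az.Resources module. The helper names, the subscription ID, the resource group, the account name, and the sign-in name below are all placeholders invented for illustration. Note also that the script as posted builds its context from `(Get-AzStorageAccount ...).Context`, which relies on the account keys; the data-plane role applies to the commented-out `New-AzStorageContext ... -UseConnectedAccount` variant in the script.

```powershell
# Hypothetical helper: builds the ARM resource ID used as the role-assignment scope.
function Get-StorageAccountScope {
    param(
        [Parameter(Mandatory)][string]$SubscriptionId,
        [Parameter(Mandatory)][string]$ResourceGroupName,
        [Parameter(Mandatory)][string]$StorageAccountName
    )
    "/subscriptions/$SubscriptionId/resourceGroups/$ResourceGroupName" +
        "/providers/Microsoft.Storage/storageAccounts/$StorageAccountName"
}

# Hypothetical wrapper; calling it requires Az.Resources and a signed-in session
# with permission to create role assignments.
function Grant-BlobDataReader {
    param(
        [Parameter(Mandatory)][string]$SignInName,
        [Parameter(Mandatory)][string]$Scope
    )
    New-AzRoleAssignment -SignInName $SignInName `
        -RoleDefinitionName 'Storage Blob Data Reader' `
        -Scope $Scope
}

# Placeholder values, for illustration only.
$scope = Get-StorageAccountScope `
    -SubscriptionId '00000000-0000-0000-0000-000000000000' `
    -ResourceGroupName 'my-rg' `
    -StorageAccountName 'mystorageacct'

Write-Host $scope
# Grant-BlobDataReader -SignInName 'user@example.com' -Scope $scope
```

The final call is left commented out so the sketch can be read and run without touching a real subscription.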


Full Script

# ====================================================================================
# Azure Storage Blob calculator:
# Base Blobs, Blob Snapshots, Versions, Deleted / not Deleted, by Container, by tier, with prefix and considering Last Modified Date
# ====================================================================================
# This PowerShell script will count and calculate blob usage on each container, or in some specific container in the provided Storage account
# Filters can be used based on  
#     All containers or some specific Container
#     Base Blobs, Blob Snapshots, Versions, All
#     Hot, Cool, Archive or All Access Tiers
#     Deleted, Not Deleted or All
#     Filtered by prefix
#     Filtered by Last Modified Date
# This can take some hours to complete, depending on the number of blobs, versions and snapshots in the container or Storage account.
# $logs container is not covered by this script (not supported)
# By default, this script lists all non-soft-deleted Base Blobs, in All Containers, with All Access Tiers
# ====================================================================================
# DISCLAIMER: Please note that this script is to be considered a sample and is provided as is, with no warranties, express or implied.
# You can use or change this script at your own risk.
# ====================================================================================
# PLEASE NOTE :
# - This script does not recover folders on ADLS Gen2 accounts.
# - Just run the script; you will be asked for your AAD credentials and for the storage account to list.
# - All other values should be defined in the script, under 'Parameters - user defined' section.
# - Uncomment the line after '# DEBUG' (inside the ContainerList function) to get the full list of all selected objects
# ====================================================================================
# For any question, please contact Luis Filipe (Msft)
# ====================================================================================
# Corrected:
#  - Null array exception for empty containers
#  - Added capacity unit "Bytes" in the output
#  - Added options to select Tenant and Subscription

# sign in
Write-Host "Logging in...";

# For global Azure
#Connect-AzAccount;

# For Azure China
Connect-AzAccount -Environment AzureChinaCloud

$tenantId = Get-AzTenant | Select-Object Id, Name | Out-GridView -Title 'Select your Tenant' -PassThru  -ErrorAction Stop
$subscId = Get-AzSubscription -TenantId $tenantId.Id | Select-Object TenantId, Id, Name | Out-GridView -Title 'Select your Subscription' -PassThru  -ErrorAction Stop

$subscriptionId = $subscId.Id;
if(!$subscriptionId)
{
    Write-Host "----------------------------------";
    Write-Host "No subscription was selected.";
    Write-Host "Exiting...";
    Write-Host "----------------------------------";
    Write-Host " ";
    exit;
}

# select subscription
Write-Host "Selecting subscription '$subscriptionId'";
Set-AzContext -SubscriptionId $subscriptionId;
CLS

#----------------------------------------------------------------------
# Parameters - user defined
# Parameter definition section
#----------------------------------------------------------------------
$selectedStorage = Get-AzStorageAccount  | Out-GridView -Title 'Select your Storage Account' -PassThru  -ErrorAction Stop
$resourceGroupName = $selectedStorage.ResourceGroupName
$storageAccountName = $selectedStorage.StorageAccountName

$containerName = ''             # Container Name, or empty for all containers
$prefix = ''                    # Set prefix for scanning (optional)
    
$deleted = 'False'              # valid values: 'True' / 'False' / 'All' 
$blobType = 'Base'              # valid values: 'Base' / 'Snapshots' / 'Versions' / 'Versions+Snapshots' / 'All Types'
$accessTier = 'All'             # valid values: 'Hot', 'Cool', 'Archive', 'All' (default 'All', as documented above)

# Select blobs on or before Last Modified Date (optional) - if any of the three values is empty, the current date will be used
$Year = ''
$Month = ''
$Day = ''
#----------------------------------------------------------------------
if ($null -eq $storageAccountName) { Write-Host "No storage account was selected. Exiting..."; exit }


#----------------------------------------------------------------------
# Date format
#----------------------------------------------------------------------
if ($Year -ne '' -and $Month -ne '' -and $Day -ne '')
{
    $maxdate = Get-Date -Year $Year -Month $Month -Day $Day -ErrorAction Stop
} else {
    $maxdate = Get-Date
}
#----------------------------------------------------------------------



#----------------------------------------------------------------------
# Format String Details in user friendly format
#----------------------------------------------------------------------
switch($blobType) 
{
    'Base'               {$strBlobType = 'Base Blobs'}
    'Snapshots'          {$strBlobType = 'Snapshots'}
    'Versions+Snapshots' {$strBlobType = 'Versions & Snapshots'}
    'Versions'           {$strBlobType = 'Blob Versions only'}
    'All Types'          {$strBlobType = 'All blobs (Base Blobs + Versions + Snapshots)'}
}
switch($deleted) 
{
    'True'               {$strDeleted = 'Only Deleted'}
    'False'              {$strDeleted = 'Active (not deleted)'}
    'All'                {$strDeleted = 'All (Active+Deleted)'}
}
if ($containerName -eq '') {$strContainerName = 'All Containers (except $logs)'} else {$strContainerName = $containerName}
#----------------------------------------------------------------------



#----------------------------------------------------------------------
# Show summary of the selected options
#----------------------------------------------------------------------
function ShowDetails ($storageAccountName, $strContainerName, $prefix, $strBlobType, $accessTier, $strDeleted, $maxdate)
{
    # CLS

    write-host " "
    write-host "-----------------------------------"
    write-host "Listing Storage usage per Container"
    write-host "-----------------------------------"

    write-host "Storage account: $storageAccountName"
    write-host "Container: $strContainerName"
    write-host "Prefix: '$prefix'"
    write-host "Blob Type: $strDeleted $strBlobType"
    write-host "Blob Tier: $accessTier"
    write-host "Last Modified Date before: $maxdate"
    write-host "-----------------------------------"
}
#----------------------------------------------------------------------



#----------------------------------------------------------------------
#  Filter and count blobs in some specific Container
#----------------------------------------------------------------------
function ContainerList ($containerName, $ctx, $prefix, $blobType, $accessTier, $deleted, $maxdate)
{

    $count = 0
    $capacity = 0

    $blob_Token = $Null
    $exception = $Null 

    write-host -NoNewline "Processing $containerName...   "

    do
    { 

        # all Blobs, Snapshots
        $listOfAllBlobs = Get-AzStorageBlob -Container $containerName -IncludeDeleted -IncludeVersion -Context $ctx  -ContinuationToken $blob_Token -Prefix $prefix -MaxCount 5000 -ErrorAction Stop
        if($listOfAllBlobs.Count -le 0) {
            write-host "No Objects found to list"
            break
        }
     
        #------------------------------------------
        # Filtering blobs by type
        #------------------------------------------
        switch($blobType) 
        {
            'Base'               {$listOfBlobs = $listOfAllBlobs | Where-Object { $_.IsLatestVersion -eq $true -or ($_.SnapshotTime -eq $null -and $_.VersionId -eq $null) } }   # Base Blobs - Base versions may have versionId
            'Snapshots'          {$listOfBlobs = $listOfAllBlobs | Where-Object { $_.SnapshotTime -ne $null } }                                                                  # Snapshots
            'Versions+Snapshots' {$listOfBlobs = $listOfAllBlobs | Where-Object { $_.IsLatestVersion -ne $true -and (($_.SnapshotTime -eq $null -and $_.VersionId -ne $null) -or $_.SnapshotTime -ne $null) } }  # Versions & Snapshots
            'Versions'           {$listOfBlobs = $listOfAllBlobs | Where-Object { $_.IsLatestVersion -ne $true -and $_.SnapshotTime -eq $null -and $_.VersionId -ne $null} }     # Versions only 
            'All Types'          {$listOfBlobs = $listOfAllBlobs } # All - Base Blobs + Versions + Snapshots
        }


        #------------------------------------------
        # filter by Deleted / not Deleted / all
        #------------------------------------------
        switch($deleted) 
        {
            'True'               {$listOfBlobs = $listOfBlobs | Where-Object { ($_.IsDeleted -eq $true)} }   # Deleted
            'False'              {$listOfBlobs = $listOfBlobs | Where-Object { ($_.IsDeleted -eq $false)} }  # Not Deleted
            # 'All'              # All Deleted + Not Deleted
        }
  
        # filter by Last Modified Date
        $listOfBlobs = $listOfBlobs | Where-Object { ($_.LastModified -le $maxdate)}   # <= Last Modified Date


        #Filter by Access Tier
        if($accessTier -ne 'All') 
           {$listOfBlobs = $listOfBlobs | Where-Object { ($_.accesstier -eq $accessTier)} }
        


        #------------------------------------------
        # Count and used Capacity
        # Count includes folder/subfolders on ADLS Gen2 Storage accounts
        #------------------------------------------
        foreach($blob in $listOfBlobs)
        {
            # DEBUG - Uncomment next line to have a full list of selected objects
            # write-host $blob.Name " Content-length:" $blob.Length " Access Tier:" $blob.accesstier " LastModified:" $blob.LastModified  " SnapshotTime:" $blob.SnapshotTime " URI:" $blob.ICloudBlob.Uri.AbsolutePath  " IslatestVersion:" $blob.IsLatestVersion  " Lease State:" $blob.ICloudBlob.Properties.LeaseState  " Version ID:" $blob.VersionID

            $count++
            $capacity = $capacity + $blob.Length
        }

        $blob_Token = $listOfAllBlobs[$listOfAllBlobs.Count -1].ContinuationToken;
        

    }while ($blob_Token -ne $Null)   

    write-host "  Count: $count    Capacity: $capacity Bytes"
    

    return $count, $capacity
}
#----------------------------------------------------------------------

$totalCount = 0
$totalCapacity = 0

# $ctx = New-AzStorageContext -StorageAccountName $storageAccountName -UseConnectedAccount -ErrorAction Stop
$ctx = (Get-AzStorageAccount -ResourceGroupName $resourceGroupName -Name $storageAccountName).Context

ShowDetails $storageAccountName $strContainerName $prefix $strBlobType $accessTier $strDeleted $maxdate


$arr = "Container", "Count", "Used capacity"   
$arr = $arr + "-------------", "-------------", "-------------"   


$container_Token = $Null


#----------------------------------------------------------------------
# Looping Containers
#----------------------------------------------------------------------
do {
    
    $containers = Get-AzStorageContainer -Context $Ctx -Name $containerName -ContinuationToken $container_Token -MaxCount 5000 -ErrorAction Stop
        
        
    if ($containers -ne $null)
    {
        $container_Token = $containers[$containers.Count - 1].ContinuationToken

        for ([int] $c = 0; $c -lt $containers.Count; $c++)
        {
            $container = $containers[$c].Name

            $count, $capacity, $exception =  ContainerList $container $ctx $prefix $blobType $accessTier $deleted $maxdate 
            $arr = $arr + ($container, $count, $capacity)

            $totalCount = $totalCount +$count
            $totalCapacity = $totalCapacity + $capacity
        }
    }

} while ($container_Token -ne $null)

write-host "-----------------------------------"
#----------------------------------------------------------------------


#----------------------------------------------------------------------
# Show details in user friendly format and Totals
#----------------------------------------------------------------------
for ($i=0; $i -lt 15; $i++) { write-host " " }
ShowDetails $storageAccountName $strContainerName $prefix $strBlobType $accessTier $strDeleted $maxdate
$arr | Format-Wide -Property {$_} -Column 3 -Force

write-host "-----------------------------------"
write-host "Total Count: $totalCount"
write-host "Total Capacity: $totalCapacity Bytes"
write-host "-----------------------------------"
#----------------------------------------------------------------------
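
The blob-type filter inside ContainerList is the least obvious part of the script, so here is a small sketch that applies the same predicates to mock objects instead of real `Get-AzStorageBlob` results. The function name `Select-BlobByType` and the four mock blobs are assumptions made for illustration; only the property names (`IsLatestVersion`, `SnapshotTime`, `VersionId`) come from the script.

```powershell
# Mock listing: an unversioned base blob, a current version, a previous version, a snapshot.
$mockBlobs = @(
    [pscustomobject]@{ Name = 'a.txt'; IsLatestVersion = $null; SnapshotTime = $null;      VersionId = $null }
    [pscustomobject]@{ Name = 'b.txt'; IsLatestVersion = $true; SnapshotTime = $null;      VersionId = 'v2' }
    [pscustomobject]@{ Name = 'b.txt'; IsLatestVersion = $null; SnapshotTime = $null;      VersionId = 'v1' }
    [pscustomobject]@{ Name = 'c.txt'; IsLatestVersion = $null; SnapshotTime = (Get-Date); VersionId = $null }
)

# Hypothetical wrapper around the predicates used in the script's ContainerList function.
function Select-BlobByType {
    param([object[]]$Blobs, [string]$BlobType)

    switch ($BlobType) {
        # Base blobs: the current version, or an object with neither snapshot time nor version ID.
        'Base'               { $Blobs | Where-Object { $_.IsLatestVersion -eq $true -or ($null -eq $_.SnapshotTime -and $null -eq $_.VersionId) } }
        # Snapshots: anything carrying a snapshot time.
        'Snapshots'          { $Blobs | Where-Object { $null -ne $_.SnapshotTime } }
        # Versions only: has a version ID, is not the current version, and is not a snapshot.
        'Versions'           { $Blobs | Where-Object { $_.IsLatestVersion -ne $true -and $null -eq $_.SnapshotTime -and $null -ne $_.VersionId } }
        # Previous versions plus snapshots.
        'Versions+Snapshots' { $Blobs | Where-Object { $_.IsLatestVersion -ne $true -and (($null -eq $_.SnapshotTime -and $null -ne $_.VersionId) -or $null -ne $_.SnapshotTime) } }
        'All Types'          { $Blobs }
    }
}

# @(...) keeps a single match as a one-element array so .Count is reliable.
$base      = @(Select-BlobByType -Blobs $mockBlobs -BlobType 'Base')
$snapshots = @(Select-BlobByType -Blobs $mockBlobs -BlobType 'Snapshots')
$versions  = @(Select-BlobByType -Blobs $mockBlobs -BlobType 'Versions')
$both      = @(Select-BlobByType -Blobs $mockBlobs -BlobType 'Versions+Snapshots')

Write-Host "Base: $($base.Count)  Snapshots: $($snapshots.Count)  Versions: $($versions.Count)  Versions+Snapshots: $($both.Count)"
```

The 'Base' predicate checks `IsLatestVersion` first because, as the script's own comment notes, a base blob may carry a version ID when versioning is enabled.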

Demo

(Animation: list storage account blob count and size.gif)
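
The script reports capacity in raw bytes. If a more readable figure is wanted, one possible approach is a small formatting helper like the sketch below. `Format-Capacity` is a hypothetical function, not part of the original script, and the container rows are invented sample data; it uses binary units (1 KiB = 1024 bytes) and the invariant culture so the decimal separator does not depend on the machine's locale.

```powershell
# Hypothetical helper: converts a byte count into a human-readable string.
function Format-Capacity {
    param([Parameter(Mandatory)][double]$Bytes)

    $units = 'Bytes', 'KiB', 'MiB', 'GiB', 'TiB', 'PiB'
    $index = 0
    # Divide by 1024 until the value fits the current unit or units run out.
    while ($Bytes -ge 1024 -and $index -lt $units.Count - 1) {
        $Bytes = $Bytes / 1024
        $index++
    }
    [string]::Format([cultureinfo]::InvariantCulture, '{0:0.##} {1}', $Bytes, $units[$index])
}

# Invented sample rows in the shape of the script's per-container results.
$rows = @(
    [pscustomobject]@{ Container = 'images'; Count = 3;  Capacity = 1536 }
    [pscustomobject]@{ Container = 'backup'; Count = 12; Capacity = 1073741824 }
)

foreach ($row in $rows) {
    Write-Host "$($row.Container)  Count: $($row.Count)  Capacity: $(Format-Capacity $row.Capacity)"
}
```

Keeping the raw byte totals in the script and formatting only at display time avoids rounding errors when the per-container values are summed.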

References

Azure Storage Blob Count & Capacity usage Calculator :https://techcommunity.microsoft.com/t5/azure-paas-blog/azure-storage-blob-count-amp-capacity-usage-calculator/ba-p/3516855

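When facing problems in a complex environment, the way to investigate is this: let muddy water settle and it slowly clears; let stillness stir and it slowly comes to life. In the cloud, it is just the same!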