Azure S3 (Object Storage) Support Development - cloud-barista/cb-spider GitHub Wiki

Azure S3 (Object Storage) Support Development

1. Overview

Azure Blob Storage support has been added to the S3 Manager of CB-Spider.
Existing providers such as AWS, GCP, and IBM use an S3-compatible API (minio-go). Azure does not offer an S3-compatible API, so a native implementation based on the Azure Go SDK (azblob) was adopted.

Design principles

  • Extend the GCP native SDK handling pattern (Versioning/CORS/Force branches) so that every feature branches to Azure native code
  • Leave the existing minio-go based code unchanged and call Azure-specific functions through a connInfo.ProviderName == "AZURE" branch
  • Keep Azure-specific logic in a separate file (S3Manager_azure.go)
  • Support all three API formats provided by CB-Spider equally (CB Standard JSON, AWS-compatible XML with Basic Auth, AWS-compatible XML with SigV4)
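The branching principle above can be illustrated with a minimal sketch. The `connectionInfo` type and both stub functions are hypothetical stand-ins invented for this example; only the `ProviderName == "AZURE"` check comes from the text, and the real S3Manager.go code may be organized differently.

```go
package main

import "fmt"

// connectionInfo is a hypothetical, trimmed-down stand-in for the S3 connection info.
type connectionInfo struct {
	ProviderName string
}

// createBucket routes Azure to a native function and leaves every other
// provider on the unchanged S3-compatible path.
func createBucket(connInfo connectionInfo, bucket string) (string, error) {
	if bucket == "" {
		return "", fmt.Errorf("bucket name is required")
	}
	if connInfo.ProviderName == "AZURE" {
		return createAzureBucketStub(bucket)
	}
	return createS3CompatibleBucketStub(bucket)
}

// The stubs only report which path was taken; they do not call any SDK.
func createAzureBucketStub(bucket string) (string, error) {
	return "azure-native:" + bucket, nil
}

func createS3CompatibleBucketStub(bucket string) (string, error) {
	return "s3-compatible:" + bucket, nil
}

func main() {
	for _, provider := range []string{"AWS", "AZURE"} {
		path, err := createBucket(connectionInfo{ProviderName: provider}, "demo")
		fmt.Println(provider, path, err)
	}
}
```

Keeping the check at the top of each public function means the S3-compatible path is never touched when Azure logic changes.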

2. Azure ↔ S3 Concept Mapping

| S3 concept | Azure implementation | Notes |
| --- | --- | --- |
| Bucket | Container | Naming rules: lowercase, 3-63 characters, hyphens allowed |
| Object | Blob | Based on Block Blob |
| Multipart Upload | Block Blob (StageBlock/CommitBlockList) | Upload ID is generated locally with xid |
| Presigned URL | SAS (Shared Access Signature) Token URL | Based on sas.BlobSignatureValues |
| Versioning | Storage Account level (ARM API) | Cannot be configured per Bucket |
| CORS | Storage Account level (Service Properties) | Cannot be configured per Bucket |
| Delete Marker | Not supported | Azure uses Soft Delete |
| AccessKey / SecretKey | StorageAccountName / StorageAccountKey | |
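Because a Bucket maps to a Container, a bucket name has to satisfy the Container naming rules before it is sent to Azure. The following is a minimal sketch of such a check. The function name `checkContainerName` is hypothetical, and the stricter details (digits allowed, first character alphanumeric, no consecutive or trailing hyphens) follow Azure's published container naming rules rather than anything stated in this page.

```go
package main

import (
	"fmt"
	"regexp"
)

// One alphanumeric character, then groups of "optional hyphen + alphanumeric".
// This rejects leading, trailing, and consecutive hyphens.
var containerNamePattern = regexp.MustCompile(`^[a-z0-9](?:-?[a-z0-9])*$`)

// checkContainerName returns nil when name is usable as an Azure container name.
func checkContainerName(name string) error {
	if len(name) < 3 || len(name) > 63 {
		return fmt.Errorf("container name %q must be 3-63 characters long", name)
	}
	if !containerNamePattern.MatchString(name) {
		return fmt.Errorf("container name %q must use lowercase letters, digits, and single hyphens", name)
	}
	return nil
}

func main() {
	for _, name := range []string{"my-bucket-01", "My-Bucket", "ab", "a--b"} {
		fmt.Printf("%-14s error=%v\n", name, checkContainerName(name))
	}
}
```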

3. Credential Mapping

Data Plane (Blob operations)

| S3ConnectionInfo field | Azure credential source | Example value / notes |
| --- | --- | --- |
| AccessKey | S3AccessKey or StorageAccountName | mystorageaccount |
| SecretKey | S3SecretKey or StorageAccountKey | base64encodedkey== |
| Endpoint | Generated automatically | mystorageaccount.blob.core.windows.net |
| UseSSL | Always true | HTTPS is enforced |
| Region | Passed from the Connection Config | Not required |
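The data plane mapping can be sketched as a small resolver. The `dataPlaneInfo` type, the function names, and the lookup order (S3AccessKey before StorageAccountName) are assumptions made for this example; the page only states that either key name is accepted.

```go
package main

import "fmt"

// dataPlaneInfo is a hypothetical holder for the resolved connection values.
type dataPlaneInfo struct {
	AccessKey string
	SecretKey string
	Endpoint  string
	UseSSL    bool
}

// firstNonEmpty returns the first non-empty credential value among keys.
func firstNonEmpty(cred map[string]string, keys ...string) string {
	for _, k := range keys {
		if v := cred[k]; v != "" {
			return v
		}
	}
	return ""
}

// resolveDataPlane derives the Blob endpoint from the account name and forces HTTPS.
func resolveDataPlane(cred map[string]string) (dataPlaneInfo, error) {
	account := firstNonEmpty(cred, "S3AccessKey", "StorageAccountName") // order is an assumption
	key := firstNonEmpty(cred, "S3SecretKey", "StorageAccountKey")
	if account == "" || key == "" {
		return dataPlaneInfo{}, fmt.Errorf("storage account name and key are required")
	}
	return dataPlaneInfo{
		AccessKey: account,
		SecretKey: key,
		Endpoint:  account + ".blob.core.windows.net",
		UseSSL:    true,
	}, nil
}

func main() {
	info, err := resolveDataPlane(map[string]string{
		"StorageAccountName": "mystorageaccount",
		"StorageAccountKey":  "base64encodedkey==",
	})
	fmt.Println(info.Endpoint, info.UseSSL, err)
}
```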

Management Plane (ARM APIs such as Versioning)

ARM operations (Versioning Enable/Suspend/Get) require Service Principal authentication. The values are read from the Credential of the same Connection Config:

| Field | Credential key |
| --- | --- |
| SubscriptionId | SubscriptionId |
| TenantId | TenantId |
| ClientId | ClientId |
| ClientSecret | ClientSecret |
| StorageAccountName | S3AccessKey or StorageAccountName |

4. Changed Files

| File | Change type | Description |
| --- | --- | --- |
| api-runtime/common-runtime/S3Manager_azure.go | New | Azure-specific S3 feature implementation (37 functions, ~1,350 lines) |
| api-runtime/common-runtime/S3Manager.go | Modified | AZURE branch added to 32 functions, plus an AZURE case in GetS3ConnectionInfo |
| go.mod | Modified | Added dependency github.com/Azure/azure-sdk-for-go/sdk/storage/azblob v1.5.0 |
| go.sum | Modified | Updated automatically |

5. Implemented Functions (S3Manager_azure.go)

Client Helper

| Function | Description |
| --- | --- |
| newAzureBlobClient | Creates an Azure Blob client based on SharedKeyCredential |
| getAzureManagementInfo | Extracts the Service Principal credentials for ARM from the Connection Config |
| getAzureResourceGroup | Automatically discovers the Resource Group that contains the Storage Account |
| azureBlockID | Generates a Base64 Block ID for multipart (uploadID prefix + part number) |
| stripAzureETagQuotes | Removes the quotes from an Azure ETag |

Bucket (Container) Operations

| Function | Description |
| --- | --- |
| createAzureBucket | Creates a Container |
| listAzureBuckets | Lists Containers → []*minio.BucketInfo |
| listAzureBucketsWithIID | Lists Containers together with IID information |
| getAzureBucket | Retrieves information on a single Container |
| deleteAzureBucket | Deletes a Container |

Object (Blob) Operations

| Function | Description |
| --- | --- |
| listAzureObjects | Lists the Blobs in a Container (prefix filter) |
| azureBlobItemToObjectInfo | Converts an Azure BlobItem → minio.ObjectInfo |
| getAzureObjectInfo | Retrieves Blob properties (size, ETag, ContentType, VersionID) |
| getAzureObjectInfoWithVersion | Retrieves the properties of a specific Blob version |
| deleteAzureObject | Deletes a Blob |
| deleteAzureObjectVersion | Deletes a specific Blob version |
| deleteMultipleAzureObjects | Deletes multiple Blobs sequentially |
| deleteMultipleAzureObjectVersions | Deletes multiple Blob versions sequentially |

Stream Operations

| Function | Description |
| --- | --- |
| getAzureObjectStream | Downloads a Blob → io.ReadCloser |
| getAzureObjectStreamWithVersion | Downloads a specific Blob version |
| putAzureObject | Uploads a Blob (UploadStream) |
| getAzureBucketTotalSize | Calculates the total size of a Container |

Multipart (Block Blob) Operations

| Function | Description |
| --- | --- |
| initiateAzureMultipartUpload | Generates an Upload ID (xid-based; Azure needs no explicit initiation) |
| uploadAzurePart | Stages a Block (StageBlock) |
| completeAzureMultipartUpload | Commits the Block List (CommitBlockList) |
| abortAzureMultipartUpload | No-op (Azure cleans up automatically after 7 days) |
| listAzureParts | Lists uncommitted Blocks (GetBlockList) |

Versioning (ARM Management Plane)

| Function | Description |
| --- | --- |
| enableAzureVersioning | Enables Versioning at the Storage Account level |
| suspendAzureVersioning | Disables Versioning at the Storage Account level |
| getAzureVersioning | Retrieves the Versioning status |
| listAzureObjectVersions | Lists the Blob versions in a Container |

CORS (Service Properties)

| Function | Description |
| --- | --- |
| setAzureBucketCORS | Sets CORS rules (Storage Account level) |
| getAzureBucketCORS | Retrieves CORS rules → *cors.Config |
| deleteAzureBucketCORS | Deletes CORS rules (by setting an empty rule list) |

Presigned URL (SAS)

| Function | Description |
| --- | --- |
| getAzurePresignedURL | Generates a SAS URL for GET (Read) or PUT (Write/Create) |

Force Operations

| Function | Description |
| --- | --- |
| forceEmptyAzureBucket | Force-deletes all Blobs (including versions) |
| forceEmptyAndDeleteAzureBucket | Force-empties the Container, deletes it, and cleans up the DB metadata |

6. Azure SDK Modules Used

| Module | Version | Purpose |
| --- | --- | --- |
| azure-sdk-for-go/sdk/storage/azblob | v1.5.0 | Blob Storage data plane (newly added) |
| azure-sdk-for-go/sdk/resourcemanager/storage/armstorage | v1.8.0 | Storage Account ARM management (existing) |
| azure-sdk-for-go/sdk/azcore | v1.18.0 | Core SDK, to.Ptr(), streaming.NopCloser (existing) |
| azure-sdk-for-go/sdk/azidentity | v1.10.1 | Service Principal authentication (existing) |

7. S3 Feature Support Comparison by CSP

| Feature | AWS | GCP | Azure | IBM | Alibaba | Tencent | OpenStack | NHN | NCP | KT |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Bucket CRUD | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Object CRUD | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Object Stream | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Multipart Upload | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ |
| ListMultipartUploads | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ |
| Presigned URL | ✅ | ✅ | ✅(SAS) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Versioning | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ |
| CORS | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ |
| Delete Marker | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ |
| Delete Multiple | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Force Empty | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Force Empty+Delete | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |

❌ = not supported (structural constraint of the CSP)


8. Features That Cannot Be Provided on Azure (Unsupported Items)

8.1 Delete Markers (DeleteS3ObjectDeleteMarker)

  • Reason: Azure Blob Storage does not use S3-style Delete Markers.
  • Azure approach: Azure uses Soft Delete, and a deleted blob is removed automatically once the configured retention period has passed.
  • Handling: returns the error "delete markers are not supported by Azure Blob Storage" (HTTP 501)

8.2 Multipart Upload (entire feature)

  • Reason: Azure Block Blob cleans up uncommitted blocks automatically after 7 days, and it has no API for listing in-progress multipart uploads (ListMultipartUploads). Without such a listing, upload state cannot be managed completely, so Multipart Upload as a whole is treated as unsupported.
  • Affected functions: InitiateMultipartUpload, UploadPart, CompleteMultipartUpload, AbortMultipartUpload, ListParts, ListMultipartUploads
  • Handling: returns the error "multipart upload is not supported by ..." (HTTP 501)
  • Alternative: use a single upload (PUT /bucket/object).

8.3 Versioning (EnableVersioning, SuspendVersioning, GetVersioning, ListObjectVersions, DeleteObjectVersion)

  • Reason: Versioning in Azure Blob Storage applies to the entire Storage Account and cannot be controlled per Container (Bucket). The S3 API assumes Bucket-level Versioning, so the two models are not compatible.
  • Handling: returns the error "bucket versioning is not supported by ..." (HTTP 501)

8.4 CORS (SetBucketCORS, GetBucketCORS, DeleteBucketCORS)

  • Reason: CORS settings in Azure apply to the entire Storage Account and cannot be isolated per Container (Bucket). The S3 API assumes Bucket-level CORS, so the two models are not compatible.
  • Handling: returns the error "CORS configuration is not supported by ..." (HTTP 501)

9. Presigned URL (SAS Token) Implementation

Azure SAS (Shared Access Signature) is used to provide the same functionality as an S3 Presigned URL:

| HTTP Method | SAS Permission | Description |
| --- | --- | --- |
| GET | Read | Blob download |
| PUT | Write + Create | Blob upload |

Generated URL format:

https://{storageAccount}.blob.core.windows.net/{container}/{blob}?{sasToken}
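The method-to-permission table and the URL format can be sketched as follows. This example does not sign anything: the `sasToken` argument is a placeholder for a token that the SDK would produce, both function names are hypothetical, and the permission letters (`r` for Read, `c` for Create, `w` for Write) follow the standard SAS `sp` field encoding.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// sasPermissionsFor maps an HTTP method to the SAS permission letters.
func sasPermissionsFor(method string) (string, error) {
	switch strings.ToUpper(method) {
	case http.MethodGet:
		return "r", nil // Read
	case http.MethodPut:
		return "cw", nil // Create + Write
	default:
		return "", fmt.Errorf("unsupported method for presigned URL: %s", method)
	}
}

// presignedURL assembles the URL format shown above from its parts.
// sasToken is an already generated query string; it is not signed here.
func presignedURL(account, container, blob, sasToken string) string {
	u := url.URL{
		Scheme:   "https",
		Host:     account + ".blob.core.windows.net",
		Path:     "/" + container + "/" + blob,
		RawQuery: sasToken,
	}
	return u.String()
}

func main() {
	perms, err := sasPermissionsFor("PUT")
	fmt.Println(perms, err)
	fmt.Println(presignedURL("mystorageaccount", "photos", "a.txt", "sp=cw&sig=placeholder"))
}
```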

11. Prerequisites

Azure Credential Configuration

To use Azure S3 in CB-Spider, the following Credential is required:

{
  "CredentialName": "azure-s3-credential",
  "ProviderName": "AZURE",
  "KeyValueInfoList": [
    {"Key": "SubscriptionId", "Value": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"},
    {"Key": "TenantId", "Value": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"},
    {"Key": "ClientId", "Value": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"},
    {"Key": "ClientSecret", "Value": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxx"},
    {"Key": "StorageAccountName", "Value": "mystorageaccount"},
    {"Key": "StorageAccountKey", "Value": "base64encodedkey=="}
  ]
}

The key names S3AccessKey/S3SecretKey can be used in place of StorageAccountName/StorageAccountKey.
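Before registering a Credential, it can help to check that the Service Principal keys needed for ARM operations are all present. The sketch below parses the JSON shape shown above and reports missing keys. The struct types and the function name `missingManagementKeys` are hypothetical and are not part of CB-Spider.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// keyValue and credential mirror the JSON shape of the Credential shown above.
type keyValue struct {
	Key   string
	Value string
}

type credential struct {
	CredentialName   string
	ProviderName     string
	KeyValueInfoList []keyValue
}

// missingManagementKeys lists the Service Principal keys that are absent or empty.
func missingManagementKeys(raw []byte) ([]string, error) {
	var c credential
	if err := json.Unmarshal(raw, &c); err != nil {
		return nil, fmt.Errorf("parse credential: %w", err)
	}
	values := make(map[string]string, len(c.KeyValueInfoList))
	for _, kv := range c.KeyValueInfoList {
		values[kv.Key] = kv.Value
	}
	var missing []string
	for _, k := range []string{"SubscriptionId", "TenantId", "ClientId", "ClientSecret"} {
		if values[k] == "" {
			missing = append(missing, k)
		}
	}
	return missing, nil
}

func main() {
	raw := []byte(`{"CredentialName":"azure-s3-credential","ProviderName":"AZURE",
	  "KeyValueInfoList":[{"Key":"SubscriptionId","Value":"x"},{"Key":"TenantId","Value":"x"}]}`)
	missing, err := missingManagementKeys(raw)
	fmt.Println(missing, err)
}
```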

Required Azure Permissions

| Operation type | Required permission |
| --- | --- |
| Blob data operations (CRUD, Stream, Multipart) | Storage Account Key (SharedKey authentication) |
| Versioning Enable/Suspend/Get | Service Principal + Storage Account Contributor or higher |
| Resource Group discovery | Service Principal + Reader or higher (subscription scope) |