Migrating Sitecore media library items from one Azure Storage Account to another

One of our customers was upgrading a multisite, multilingual Sitecore installation from version 9.0.1 to 9.3, and had gigabytes of media to move from the old site to the new one.

Moving the Sitecore items themselves was easy: create a package and import it into the new site – I’ll show you how to do that here. The real challenge was the files under the media library. The old files lived in an Azure Storage Account, and we needed to attach the correct files to the new Sitecore items so that they reference the new Storage Account.

Another challenge was to identify the Sitecore items that refer directly to files in the old Azure Storage Account, as can happen in, for example, a Rich Text Editor field.

Let’s imagine you have this old site S1, with your content items in /sitecore/content/Sites and your media items in /sitecore/media library/Sites. The media items are stored in an Azure Storage Account (S1.blob.core.windows.net).

We’ll create one package for the Sitecore content items and another for the Sitecore media library items. Then we’ll download the media files from Azure. Finally, using a PowerShell script, we’ll import the media files into the new Azure Storage Account and attach them to the Sitecore items in the new site.

Step 1 – Create a package

To create a package, go to the Sitecore Desktop and choose Start/Development Tools/Package Designer.

I recommend creating two separate packages, one for your content items (under /sitecore/content) and one for your media library items (under /sitecore/media library).

In the Package Designer application, add the items statically.

When you’re finished adding items to this package, enter a meaningful source name (for example “lite content Site 1”). This brings you back to the main window of the Package Designer application. Enter a Package Name and an Author, as this information is shown when the package is installed.

Then click the “Generate ZIP” button to create your zip file. Don’t forget to download the package in the last step!

Do the same steps to create a package for the media library items under /sitecore/media library/Sites.

The media package includes the binaries (blobs) for all the media files. We don’t need them in this form, since we are going to migrate those files programmatically from one Azure Storage Account to another. Using a zip tool (I use 7-Zip), open the package and remove the blob folder. The resulting media package is also much smaller, and thus faster to install.
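If you would rather script this than click through a zip tool, something along these lines might work. It is only a sketch: Remove-ZipFolder is a hypothetical helper name, and I am assuming that the archive you point it at contains the blob folder at its top level (depending on how the package was generated, that folder may sit inside a nested archive, which this sketch does not unwrap).

```powershell
# Load the .NET zip classes (needed on Windows PowerShell, harmless elsewhere).
Add-Type -AssemblyName System.IO.Compression
Add-Type -AssemblyName System.IO.Compression.FileSystem

# Hypothetical helper: deletes every entry under a top-level folder of a zip file.
# Returns the number of entries removed.
function Remove-ZipFolder {
    param(
        [Parameter(Mandatory = $true)][string]$ZipPath,   # use an absolute path
        [string]$FolderName = 'blob'                      # assumed folder name
    )
    $zip = [System.IO.Compression.ZipFile]::Open($ZipPath, [System.IO.Compression.ZipArchiveMode]::Update)
    try {
        $prefix = $FolderName.TrimEnd('/') + '/'
        # Collect the matches first; deleting while enumerating would modify the collection.
        $toRemove = @($zip.Entries | Where-Object {
            $_.FullName.Replace('\', '/').StartsWith($prefix, [System.StringComparison]::OrdinalIgnoreCase)
        })
        foreach ($entry in $toRemove) {
            $entry.Delete()
        }
        return $toRemove.Count
    } finally {
        # Disposing the archive writes the changes back to disk.
        $zip.Dispose()
    }
}
```

Work on a copy of the package, for example `Remove-ZipFolder -ZipPath 'C:\temp\media-package.zip'` (the path is a placeholder), so the original stays intact if something goes wrong.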

Step 2 – Import the media files from Azure Storage Account

Log in to your Azure account and browse to the Azure Storage Account. To get permission to download the files, you’ll need to create a Shared Access Signature (SAS).

In the Azure portal, find and open your Storage Account. In the left-hand menu, click “Shared access signature”.

Choose which resource types you want to be allowed to download (I usually check everything) and generate your SAS string.

Save the value of the “SAS token” field. This is what you’ll need later.

I am going to use the azcopy utility. If you haven’t installed it already, download it from https://aka.ms/downloadazcopy-v10-windows

Open a PowerShell console as an administrator.

First you’ll need to log in to your Azure account from the PowerShell console. Type the command “az login” and follow the instructions to log in.

The command to get all media files from the Azure Storage Account is

azcopy copy 'https://{STORAGE_ACCOUNT_SERVER}/medialibrary/Sites/{SAS_KEY}' '{LOCAL_PATH}' --recursive

All the files from the Storage Account are now transferred to your local folder, keeping the folder hierarchy.
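To make the placeholders in that command a bit more concrete, here is a small sketch of how the source URL could be assembled. Get-BlobSasUrl is a hypothetical helper, the host name is the S1 example from above, and the token and local folder are placeholders, not real values. The SAS token is appended as the query string of the URL, so the sketch tolerates a token with or without its leading question mark.

```powershell
# Hypothetical helper: builds the azcopy source URL from its three parts.
function Get-BlobSasUrl {
    param(
        [Parameter(Mandatory = $true)][string]$StorageHost,    # e.g. S1.blob.core.windows.net
        [Parameter(Mandatory = $true)][string]$ContainerPath,  # e.g. medialibrary/Sites
        [Parameter(Mandatory = $true)][string]$SasToken        # value of the "SAS token" field
    )
    # Strip a leading "?" so the URL never ends up with two of them.
    $query = $SasToken.TrimStart('?')
    return 'https://{0}/{1}/?{2}' -f $StorageHost.Trim('/'), $ContainerPath.Trim('/'), $query
}

# Placeholder values only; paste your own SAS token here.
$source = Get-BlobSasUrl -StorageHost 'S1.blob.core.windows.net' `
                         -ContainerPath 'medialibrary/Sites' `
                         -SasToken '?<your SAS token>'

# Uncomment once the values are real ('C:\temp\media' is an assumed local folder):
# azcopy copy $source 'C:\temp\media' --recursive
```

Keeping the token in a variable rather than pasting it into the command also makes it a little easier to avoid leaving it in a shared script.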

Step 3 – Upload the media files to Sitecore CM server

It is good practice to always use the same folder inside the Sitecore website when you have files to transfer. I always use the $SitecoreDataFolder/upload (aka /App_Data/upload) folder to move files.

If you have an on-premises installation, it is very easy to transfer your new media files directly to the upload folder with FTP or a shared folder.

If your CM server is an Azure App Service, you’ll need the CM server’s Publish Settings to get the FTP login credentials. In the Azure Portal, browse to the page with all the information about your CM server. Stay on the Overview page and click the “Get publish profile” button.

The Publish Profile is an XML file containing the credentials needed to connect to this server with FTP.

The profile named “SERVERNAME – FTP” is the one you need to look at. The attribute “publishUrl” is the URL to connect to, “userName” is the username, and “userPWD” is the password. Use this information to upload all the media files to the server’s /App_Data/upload folder.
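If you prefer not to read the XML by eye, the three attributes can be pulled out with a few lines of PowerShell. This is a sketch under assumptions: Get-FtpPublishProfile is a hypothetical helper, the sample XML is made up (names and password are placeholders), and I am assuming the FTP profile can be recognised by publishMethod="FTP"; if your file differs, filter on the profile name instead.

```powershell
# Hypothetical helper: returns publishUrl, userName and userPWD of the FTP profile.
function Get-FtpPublishProfile {
    param(
        [Parameter(Mandatory = $true)][xml]$PublishSettings   # content of the downloaded file
    )
    $PublishSettings.publishData.publishProfile |
        Where-Object { $_.publishMethod -eq 'FTP' } |
        Select-Object -First 1 -Property publishUrl, userName, userPWD
}

# Made-up sample with the same shape as a downloaded publish profile.
$sample = @'
<publishData>
  <publishProfile profileName="example-cm - Web Deploy" publishMethod="MSDeploy"
                  publishUrl="example-cm.scm.azurewebsites.net:443"
                  userName="$example-cm" userPWD="placeholder" />
  <publishProfile profileName="example-cm - FTP" publishMethod="FTP"
                  publishUrl="ftp://example.invalid/site/wwwroot"
                  userName="example-cm\$example-cm" userPWD="placeholder" />
</publishData>
'@

$ftp = Get-FtpPublishProfile -PublishSettings $sample
Write-Host "Connect to $($ftp.publishUrl) as $($ftp.userName)"
```

With a real file, pass its content instead of the sample, for example `Get-FtpPublishProfile -PublishSettings (Get-Content 'C:\temp\cm.PublishSettings' -Raw)` (again a placeholder path).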

Step 4 – Install Sitecore Items

Install the newly created packages on the CM server, using the Install Package wizard.

Step 5 – Run SPE script

It is now time to attach the files to the Sitecore items and transfer those files to Azure Storage. To connect the Sitecore items and the files, my script looks for a Sitecore item with the same name as the file, in a corresponding path. If your files and your Sitecore items inside the media library are structured differently, you’ll need to change the search command in the Find-MediaItem function:

[Sitecore.Data.Items.MediaItem]$item = Get-Item master: -Query $query | Where-Object { $_.TemplateName -eq $TemplateName } | Select -First 1

This is the whole script for the transfer of all the files.

function Update-MediaItem {
    [CmdletBinding()]
    param(
        [Parameter(Position=0, Mandatory=$true, ValueFromPipeline=$true)]
        [ValidateNotNullOrEmpty()]
        [string]$filePath,

        [Parameter(Position=1, Mandatory=$true)]
        [ValidateNotNullOrEmpty()]
        [string]$mediaPath)

    # Clear the old file reference and default the Alt text to the item name.
    $itemMedia = Get-Item -Path "master:$mediaPath"
    $itemMedia.Editing.BeginEdit()
    $itemMedia.Fields['File Path'].Value = ""
    if ($itemMedia.Fields['Alt']) {
        $itemMedia.Fields['Alt'].Value = $itemMedia.Name
    }
    $itemMedia.Editing.EndEdit() | Out-Null

    # Attach the content of the file to the media item.
    [Sitecore.Data.Items.MediaItem]$item = Get-Item -Path "master:$mediaPath"
    [Sitecore.Resources.Media.Media]$media = [Sitecore.Resources.Media.MediaManager]::GetMedia($item)
    $extension = [System.IO.Path]::GetExtension($filePath).Substring(1)
    $stream = New-Object -TypeName System.IO.FileStream -ArgumentList $filePath, "Open", "Read"
    try {
        $media.SetStream($stream, $extension)
    } finally {
        # Release the file handle even if SetStream fails, so the file can be retried or deleted.
        $stream.Close()
    }
}

function Find-MediaItem {
    [CmdletBinding()]
    param(
        [Parameter(Position=0, Mandatory=$true, ValueFromPipeline=$true)]
        [ValidateNotNullOrEmpty()]
        [string]$fullName,

        [Parameter(Position=1, Mandatory=$true)]
        [ValidateNotNullOrEmpty()]
        [string]$extension
        )

    # Keep only the folder part of the path below App_Data\upload\ (16 characters long).
    $mediaName = [io.path]::GetFileNameWithoutExtension($fullName)
    $fileName = [io.path]::GetFileName($fullName)
    $uploadPosition = $fullName.IndexOf("App_Data\upload\")
    $SitecorePath = $fullName.Substring($uploadPosition + 16)
    # Cut the file name off the end; searching for the media name could match a folder with the same name.
    $SitecorePath = $SitecorePath.Substring(0, $SitecorePath.Length - $fileName.Length)
    # Wrap every folder name in # so that names with spaces or dashes are valid in a Sitecore query.
    $SitecorePath = "#" + $SitecorePath.replace("\", "#/#")
    if ($SitecorePath.EndsWith("/#")) {
        $SitecorePath = $SitecorePath.Substring(0, $SitecorePath.Length - 2)
    } else {
        $SitecorePath = $SitecorePath + "#"
    }
    $query = "/sitecore/media library//{0}//*[@@name='{1}']" -f $SitecorePath, $mediaName
    Write-Host $query

    # Map the file extension to the name of the media template.
    $TemplateName = $extension
    switch ($extension) {
        "pdf"  { $TemplateName = "Pdf" }
        "jpg"  { $TemplateName = "Jpeg" }
        "jpeg" { $TemplateName = "Jpeg" }
        "png"  { $TemplateName = "Image" }
        "svg"  { $TemplateName = "Image" }
        "mp3"  { $TemplateName = "Mp3" }
    }
    [Sitecore.Data.Items.MediaItem]$item = Get-Item master: -Query $query | Where-Object { $_.TemplateName -eq $TemplateName } | Select -First 1
    # $item is $null when nothing matched ($null.ID -eq "" is never true), so test the item itself.
    if (-not $item) {
        # Second try: remove spaces from the path and replace them with dashes in the name.
        $SitecorePath = $SitecorePath.replace(" ", "")
        $mediaName = $mediaName.replace(" - ", "-")
        $mediaName = $mediaName.replace(" ", "-")
        $query = "/sitecore/media library//{0}//*[@@name='{1}']" -f $SitecorePath, $mediaName
        [Sitecore.Data.Items.MediaItem]$item = Get-Item master: -Query $query | Where-Object { $_.TemplateName -eq $TemplateName } | Select -First 1
    }
    return $item
}

# Match on the real extension only (escaped dot, anchored at the end of the name).
$ImageFileRegEx = '\.(jpg|png|pdf|jpeg|svg|mp3)$'
Get-ChildItem -Path "$SitecoreDataFolder\upload\" -Recurse -File | Where-Object -FilterScript {$_.Name -match $ImageFileRegEx} | ForEach-Object {
    $file = $_
    $extension = [System.IO.Path]::GetExtension($file.FullName).Substring(1).ToLower()
    $mediaItem = Find-MediaItem $file.FullName $extension
    if ($mediaItem) {
        $mediaPath = "{0}{1}" -f $([Sitecore.Constants]::MediaLibraryPath), $mediaItem.MediaPath
        Write-Host $mediaItem.ID $mediaPath
        try {
            Update-MediaItem $file.FullName $mediaPath
            # Only delete the file once it has been attached to the item.
            Remove-Item $file.FullName
            Write-Host "Imported and deleted"
        } catch {
            Write-Host "Something went wrong: $($_.Exception.Message)"
        }
    } else {
        # $(...) is needed here, otherwise only $file is expanded inside the string.
        Write-Host "Not found: $($file.BaseName) $extension"
    }
}

Update-MediaItem is the function that attaches a file to a Sitecore media item.

Find-MediaItem is the function that searches for a Sitecore media item based on the filename. We build a query from the path of the file, and pick the template from the file extension. If the first Get-Item returns nothing, we try again after cleaning up the filename.

The main loop goes through every file under upload/ (Get-ChildItem). For each file, we search for the corresponding Sitecore item (Find-MediaItem). If we find an item, we upload the file to this item and remove the file from the folder (Update-MediaItem and Remove-Item).
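To see what kind of query comes out of a given file, here is a standalone sketch that only does the string work, so it can be run outside Sitecore. Convert-UploadPathToQuery is a hypothetical helper, not part of the script above, and the example path is invented. It splits the path by hand instead of relying on System.IO.Path, and, unlike the script, it drops the folder part of the query for a file that sits directly in the upload folder.

```powershell
# Hypothetical helper: turns a file path below App_Data\upload\ into a Sitecore query.
function Convert-UploadPathToQuery {
    param(
        [Parameter(Mandatory = $true)][string]$FullName,
        [string]$UploadMarker = 'App_Data\upload\'
    )
    # Part of the path below the upload folder, e.g. Sites\Site 1\Images\logo.png
    $relative = $FullName.Substring($FullName.IndexOf($UploadMarker) + $UploadMarker.Length)
    $segments = @($relative -split '\\')
    # The last segment is the file; drop its extension to get the item name.
    $mediaName = $segments[-1] -replace '\.[^.]*$', ''
    # Every remaining segment is a folder; wrap each in # to escape spaces and dashes.
    $folders = @($segments | Select-Object -SkipLast 1 | ForEach-Object { "#$_#" })
    if ($folders.Count -eq 0) {
        return "/sitecore/media library//*[@@name='{0}']" -f $mediaName
    }
    return "/sitecore/media library//{0}//*[@@name='{1}']" -f ($folders -join '/'), $mediaName
}

# Invented example path.
$query = Convert-UploadPathToQuery -FullName 'C:\inetpub\cm\App_Data\upload\Sites\Site 1\Images\logo.png'
Write-Host $query
# /sitecore/media library//#Sites#/#Site 1#/#Images#//*[@@name='logo']
```

Running a few of your own paths through something like this before the real import is a cheap way to check whether the folder structure on disk lines up with the structure in the media library.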
