There comes a time when you need to perform some heavy-duty data entry into Sitecore, and that usually involves a lot of images.

Sitecore does offer a very nifty feature: you can zip up your media, upload the zip, and let Sitecore unpack it and create the entire folder and file structure within the Media Library. However, this sometimes falls short because of web server limitations. What if the zip file ends up being many hundreds of MB, with a complex folder and file structure? It will take forever to upload, and just as long to import into Sitecore, if it ever makes it in at all.

I ran into this issue recently: I had to perform a lot of data entry into Sitecore and import some 300 MB worth of images in a very complex folder structure. Breaking up the zip into smaller zips was not really an option, as even the smaller parts were practically impossible to import.

A bit of inventive googling and some glue code got me to a solution.

Media item creation

First of all, I found a nifty function by Brian Pedersen that adds a file to the Media Library, provided you have programmatic access to the file location: https://briancaos.wordpress.com/2009/07/09/adding-a-file-to-the-sitecore-media-library-programatically/.

I changed his function just slightly to this:

        public MediaItem AddFile(string filepath, string sitecorePath, string mediaItemName)
        {
            // Create the options
            Sitecore.Resources.Media.MediaCreatorOptions options = new Sitecore.Resources.Media.MediaCreatorOptions();
            // Store the file in the database, not as a file
            options.FileBased = false;
            // Remove file extension from item name
            options.IncludeExtensionInItemName = false;
            // Overwrite any existing file with the same name
            options.OverwriteExisting = true;
            // Do not make a versioned template
            options.Versioned = false;
            // set the path
            options.Destination = sitecorePath + "/" + mediaItemName;
            // Set the database
            options.Database = Sitecore.Configuration.Factory.GetDatabase("master");

            // Now create the file
            Sitecore.Resources.Media.MediaCreator creator = new Sitecore.Resources.Media.MediaCreator();
            MediaItem mediaItem = creator.CreateFromFile(filepath, options);
            return mediaItem;
        }

That is, given the Sitecore folder path, the new media item name and the file path, the above function will create the media item.

Media item name

To get the media item name, I opted to use the file name without the extension, which is what Sitecore does anyway. So I needed to take the filename and remove the last part (the extension). Just splitting on the "." and taking the first non-empty element would not cut it: "image.date.jpg" is a valid filename, and I would get "image" instead of "image.date".

So I found a function that, given an IEnumerable, returns an IEnumerable with the last N elements removed. Here it is:

        public IEnumerable<T> SkipLastN<T>(IEnumerable<T> source, int n)
        {
            // Buffer up to n items; an item is only yielded once n newer items have been seen
            var cache = new Queue<T>(n + 1);
            // Dispose the enumerator when iteration ends
            using (var it = source.GetEnumerator())
            {
                while (it.MoveNext())
                {
                    cache.Enqueue(it.Current);
                    if (cache.Count > n)
                        yield return cache.Dequeue();
                }
            }
        }
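
To see what the function does in isolation, here is a small console sketch. It is only an illustration: the SkipLastNDemo class and the sample inputs are hypothetical, and the method is restated as a static with foreach (same buffering logic) so the block compiles on its own.

```csharp
using System;
using System.Collections.Generic;

public static class SkipLastNDemo
{
    // Same buffering logic as the function above, restated as a static method.
    public static IEnumerable<T> SkipLastN<T>(IEnumerable<T> source, int n)
    {
        var cache = new Queue<T>(n + 1);
        foreach (var item in source)
        {
            cache.Enqueue(item);
            // Only release the oldest item once n newer items are buffered behind it.
            if (cache.Count > n)
                yield return cache.Dequeue();
        }
    }

    public static void Main()
    {
        // Dropping the last two of five numbers.
        Console.WriteLine(string.Join(",", SkipLastN(new[] { 1, 2, 3, 4, 5 }, 2)));          // 1,2,3
        // Dropping the extension part of a split file name.
        Console.WriteLine(string.Join(".", SkipLastN(new[] { "image", "date", "jpg" }, 1))); // image.date
        // A single-element sequence yields nothing, hence the empty string.
        Console.WriteLine(string.Join(",", SkipLastN(new[] { "only" }, 1)).Length);          // 0
    }
}
```

Because items are held back in the queue until enough newer ones arrive, the source is enumerated only once and never needs to be counted up front.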

Then, getting the item name is just a matter of applying the above:

            var itemName=string.Join(".",SkipLastN(file.Name.Split(new[] {'.'}, StringSplitOptions.RemoveEmptyEntries), 1));

Here, file is the FileInfo object for the file I want to upload.
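
As an aside, for typical file names the framework's Path.GetFileNameWithoutExtension gives the same result without a helper. The sketch below is an alternative, not the approach used in this post; the ItemNameFor helper and the sample names are hypothetical, and names with consecutive dots may come out differently from the split-and-join version.

```csharp
using System;
using System.IO;

public static class ItemNameDemo
{
    // Hypothetical helper: strips only the last extension, and falls back to
    // the dot-trimmed file name when nothing is left (e.g. ".htaccess").
    public static string ItemNameFor(string fileName)
    {
        var name = Path.GetFileNameWithoutExtension(fileName);
        return string.IsNullOrEmpty(name) ? fileName.Trim('.') : name;
    }

    public static void Main()
    {
        Console.WriteLine(ItemNameFor("image.date.jpg")); // image.date
        Console.WriteLine(ItemNameFor("README"));         // README
        Console.WriteLine(ItemNameFor(".htaccess"));      // htaccess
    }
}
```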

File path

That's just the "FullName" property of the FileInfo object for the file I want to upload.

Sitecore folder path

This was the trickiest one for me. I wanted to maintain the directory structure intact. So what I thought was: 

1. All files to be uploaded must ultimately exist under a single folder. This means my import mechanism would import a folder, its subfolders and files recursively. So I needed a "root" folder path for the source.

2. I might not want to put the uploaded structure directly under the Media Library folder, but under a subfolder in Sitecore. So I needed a "root" folder path for the destination in Sitecore.

The end result should be that the contents of the destination root folder in Sitecore mirror the folder and file structure of the source folder one-to-one.

DirectoryInfo to the rescue! I jotted down the following:

        public void ProcessFolder(DirectoryInfo folder, string destination, HttpContext context)
        {
            context.Response.Write(string.Format("Entering {0}...{1}", destination, Environment.NewLine));
            context.Response.Flush();
            foreach(var file in folder.GetFiles())
            {
                processFile(file, destination, context);
            }
            foreach(var subfolder in folder.GetDirectories())
            {
                // subfolder.Name is the last path segment, i.e. the folder's name relative to its parent
                ProcessFolder(subfolder, destination + "/" + subfolder.Name, context);
            }
        }

The destination string is the Sitecore folder path that corresponds to the folder being processed. As you can see, the function recursively builds the destination folder path, given an initial path and an initial DirectoryInfo object. HttpContext is there just for reporting purposes (i.e. context.Response.Write).
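
The folder-to-path mapping can be checked without touching the Media Library. The following is a dry-run sketch under stated assumptions: MapFolders, the temporary sample tree and the destination path are all hypothetical and not part of the handler. It only lists which Sitecore path each source folder would map to.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class FolderMapDemo
{
    // Yields each source folder together with the Sitecore path it maps to,
    // appending one folder name per level of recursion.
    public static IEnumerable<KeyValuePair<string, string>> MapFolders(DirectoryInfo folder, string destination)
    {
        yield return new KeyValuePair<string, string>(folder.FullName, destination);
        foreach (var subfolder in folder.GetDirectories())
        {
            foreach (var pair in MapFolders(subfolder, destination + "/" + subfolder.Name))
                yield return pair;
        }
    }

    public static void Main()
    {
        // Build a throwaway tree: <temp>/MyImages_<guid>/Products/Thumbs
        var root = Directory.CreateDirectory(
            Path.Combine(Path.GetTempPath(), "MyImages_" + Guid.NewGuid().ToString("N")));
        root.CreateSubdirectory(Path.Combine("Products", "Thumbs"));

        // Materialize the mapping before deleting the sample tree.
        var pairs = new List<KeyValuePair<string, string>>(
            MapFolders(root, "/sitecore/media library/SomeSite/Images"));
        root.Delete(true);

        foreach (var pair in pairs)
            Console.WriteLine(pair.Value);
        // /sitecore/media library/SomeSite/Images
        // /sitecore/media library/SomeSite/Images/Products
        // /sitecore/media library/SomeSite/Images/Products/Thumbs
    }
}
```

A dry run like this is a cheap way to confirm the source and destination roots line up before importing a few hundred MB of files.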

As for the processFile function:

        public void processFile(FileInfo file, string sitecoreFolder, HttpContext context)
        {
            var itemName=string.Join(".",SkipLastN(file.Name.Split(new[] {'.'}, StringSplitOptions.RemoveEmptyEntries), 1));
            if (string.IsNullOrEmpty(itemName))
            {
                itemName=file.Name.Trim('.');
            }
            var mi = AddFile(file.FullName, sitecoreFolder, itemName);
            context.Response.Write(string.Format("created {0}{1}",mi.Path ,Environment.NewLine));
        }

You can see the IEnumerable popping method in action there, as well as the AddFile method.

Putting it all together

Finally, the calling code:

        public void ProcessRequest(HttpContext context)
        {
            context.Response.ContentType = "text/plain";
            // QueryString[...] returns null for a missing parameter, so guard before trimming
            var source = (context.Request.QueryString["source"] ?? string.Empty).Trim('/');
            var destination = (context.Request.QueryString["destination"] ?? string.Empty).Trim('/');
            if (string.IsNullOrEmpty(source) || string.IsNullOrEmpty(destination))
            {
                return;
            }
            // Source folder, relative to the Sitecore temp folder
            var directory = context.Server.MapPath(Sitecore.IO.TempFolder.Folder + "/" + source);
            // Destination folder, relative to the Media Library root
            var dest = Sitecore.Data.Database.GetDatabase("master").GetItem(Sitecore.ItemIDs.MediaLibraryRoot).Paths.Path + "/" + destination;
            ProcessFolder(new DirectoryInfo(directory), dest, context);
        }

This is just the "ProcessRequest" method of a generic handler (ashx). I uploaded all the files, in their original directory/file structure, to the temp folder of the Sitecore installation. I then created the "root" folder in the Media Library and ran the above handler, providing source and destination as query string parameters. So for images uploaded under /temp/MyImages, a destination of /Sitecore/Media Library/SomeSite/Images, and a handler named "ImageImporter.ashx", the URL would be

ImageImporter.ashx?source=MyImages&destination=/SomeSite/Images

And that's it! Running the importer, I went through 300 MB worth of images, in about 100 folders and subfolders, in less than 20 seconds.

In the link below, you may find the full code for the ImageImporter handler.

Happy coding!

ImageImporter.ashx.cs