Friday 2 April 2021

Sitecore Content Serialization - first look

Agenda

  1. Preparations
  2. Configuration
  3. Module Configuration
  4. Performing Serialization Operations in CLI
  5. How to migrate from Unicorn to SCS
  6. Generating and Installing Item Packages


I have used Unicorn and Sitecore TDS in past Sitecore solutions. Both tools are good, but one of my projects is about to finish and I expect to start a new one soon, so I decided to have a look at the new Sitecore Content Serialization, which is available starting from Sitecore 10.

According to Sitecore:  

Sitecore Content Serialization (SCS) is a system for serializing, sharing, and deploying content items, as well as keeping them in version control.

Based on the description, I expect the new SCS tool to replace the tools I was using. So, let's have a look at how good SCS is.

Note: personally, I am not a big fan of TDS and prefer to use Unicorn. Thus, I will be comparing SCS with Unicorn only.

Note: all assumptions in this article are based on my investigations and might be wrong. I do not have experience with using the tool and might be incorrect in decisions I made.    

To start my investigation, I installed a fresh Sitecore 10 instance on my development PC and created a very simple solution with three empty projects: Foundation, Feature and Project. I will be using it to see how SCS can solve the difficulties I saw with my Sitecore solutions in the past.

Preparations

To use Sitecore Content Serialization, you need to install Sitecore Management Services on your Sitecore CM instance, install the Sitecore Command Line Interface, and perform some additional configuration steps. You can read more about this below.
I will be using the CLI, since I prefer a command line to a GUI. Also, the CLI is going to be used during deployment, so I want to use the same approach on the development instance.

Install Sitecore Management Services 

First of all, I figured out that I need to install Sitecore Management Services on my CM instance to start using Sitecore Content Serialization. This is just a regular Sitecore package that contains files only.

This package installs a service that allows Sitecore Command Line Interface or Sitecore for Visual Studio to connect to Sitecore and execute serialization commands.

When we use Unicorn, we also need to copy some Unicorn specific files to Sitecore, so this looks familiar. In most cases, Unicorn specific files were part of the build artifacts and pushed to Sitecore during deployment. 

For Sitecore Management Services I expect to use the same approach. To do this, I am going to unpack the package and keep it as part of my solution. My development deployment scripts will just copy the files to the build artifacts, which will be moved to the Sitecore CM instance at some point. This approach automates the package installation, so we do not need to ask every developer to install it manually.

BTW: the first thing I did was download the latest version of Sitecore Management Services (which was 3.0.0) and install it on Sitecore 10. As a result... Sitecore did not start and failed with an exception. After a brief investigation, I figured out that Sitecore 10 works with version 2.0.0, while version 3.0.0 is for Sitecore 10.1. It was a bit unclear from the download page... at least for me :)

Install Sitecore Command Line Interface

Good. The next step is to install Sitecore CLI. You can find the installation details here. Basically, it is just a few commands that need to be executed in the root folder of your solution (if you install it locally for the project).

As a result, you will get a folder named ".config" with only one file inside: "dotnet-tools.json". If you open the file, you can see the version of the CLI that is used. I had to change it to 2.0.0 since I was not sure that version 3.0.0 of the tool would work with the Sitecore 10 services.

Unicorn, in turn, has an administration web page for running serialization commands, or a PowerShell script to execute commands from Visual Studio tasks.

Both approaches used in Unicorn and SCS look fine to me. Good job!

Configuration

Before we start using serialization, we need to connect the CLI to the Sitecore instance. The process is described here. Basically, you need to use either an interactive or a non-interactive login. Again, I decided to use the non-interactive one since this option is going to be used in the deployment pipeline. After I made all the necessary changes to Sitecore Identity Server and Sitecore CM, I tried to execute the login command as described in the documentation, but got a message in PowerShell saying that I need to execute the "sitecore init" command first. It looks like I missed this step when travelling from one documentation page to another.

The "dotnet sitecore init" command needs to be executed in the root folder of your solution. After execution, the file "sitecore.json" will be created. This file contains the configuration for content serialization. You can find more information about the possible settings here: general description, Item path length, Relative path, etc.

After reviewing this file, I noticed a few things:

1. The path to the module configuration does not correspond to the layout of my solution

2. The serialization folder is located under the project, while I am used to seeing the serialization folder on the same level as the code folder that contains the project.

My spike project has the layout below:
/src/Feature/Feature Project/code - feature project code

/src/Feature/Feature Project/serialization - feature project serialization

/src/Foundation/Foundation Project/code - foundation project code

/src/Foundation/Foundation Project/serialization - foundation project serialization

/src/Project/Project Project/code - solution project code

/src/Project/Project Project/serialization - solution project serialization


Thus, I changed the modules property in the sitecore.json file to the following:

  "modules": [

    "src/*/*/*/*.module.json"

  ],

This change ensures that the serialization configuration of all my projects can be found, and I do not need to update the file when I add a new project.
Also, I need to move the serialization folder one level up so that it sits on the same level as the code folder. Thus, I changed the defaultModuleRelativeSerializationPath setting to point one level up:
  "defaultModuleRelativeSerializationPath": "../serialization"

This seems fine, but once I saw the serialization settings I had two questions.

Q1: Is it possible to have a single serialization folder for all modules? Sometimes you need to get all serialization files in one place, and it is much faster to copy them if they are all stored in one folder on the same level as the solution file. In most of my projects, the build script needs to search for all serialization files and copy them under the same root artifacts folder for further deployment. This operation was pretty time consuming, and in some projects we decided to change the default Unicorn serialization root by changing the "physicalRootPath" attribute of the "targetDataStore". The default value was:

<targetDataStore physicalRootPath="$(sourceFolder)\$(layer)\$(module)\serialization" useDataCache="false" singleInstance="true" />

A1: I expected to use the same trick with SCS but... I was not able to find a better way than specifying the full path to the serialization folder. Well, the answer to this question is yes, it is possible, but... are you sure that all developers will be using the same path to the solution and that we can safely hardcode it? What about the build agent?

I hope there is a way to specify the path relative to the solution rather than the project. Probably I just did not find it.
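To make the two sitecore.json settings I changed a bit more tangible, here is a minimal Python sketch of how I read them. It is only an illustration under assumptions: that `*` in the modules pattern matches exactly one folder level, that the module file sits in the code folder, and that the serialization path is resolved relative to the module file. The module file name `Feature.Navigation.module.json` is hypothetical, and this is not how the CLI is actually implemented.

```python
import posixpath
from fnmatch import fnmatchcase

MODULES_PATTERN = "src/*/*/*/*.module.json"
RELATIVE_SERIALIZATION_PATH = "../serialization"


def matches_modules_pattern(path, pattern=MODULES_PATTERN):
    """Match segment by segment so that '*' never crosses a folder boundary."""
    path_parts = path.split("/")
    pattern_parts = pattern.split("/")
    if len(path_parts) != len(pattern_parts):
        return False
    return all(fnmatchcase(part, pat) for part, pat in zip(path_parts, pattern_parts))


def serialization_folder(module_file, relative=RELATIVE_SERIALIZATION_PATH):
    """Resolve the serialization folder relative to the module file's folder."""
    module_dir = posixpath.dirname(module_file)
    return posixpath.normpath(posixpath.join(module_dir, relative))


# Hypothetical module file placed in the code folder of a feature project
module_file = "src/Feature/Navigation/code/Feature.Navigation.module.json"
print(matches_modules_pattern(module_file))
print(serialization_folder(module_file))
```

With this reading, every module resolves to its own sibling serialization folder, which is exactly why a single shared folder is not reachable without a path that depends on something other than the project.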

However, while trying to find an answer for Q1 I got another question.

Q2: Is it possible to override settings in this file? I understand that all settings in this file must be the same for all developers, otherwise we will get conflicts in the serialization files. However, if there is no way to specify the serialization path relative to the solution and we have to use a full path, I need a way to override the setting on a developer PC.

A2: I was not able to find the answer to this question. Probably, we do not even need to do such things and it is good that we cannot override :).


Module Configuration

Ok, now we need to create a configuration file for each project in my solution that has serialization. I failed to find a CLI command that creates a default configuration file. Thus, I need to copy the example from here or here.

Note that suggested naming convention for this file is <name of module>.module.json

This means that plain copy\paste will not work and you need to rename the file for every project. This might be fine. What is much worse is that the file has a few more places where you need to specify the module name:

namespace - required -  The namespace of the module

name - required - name of the folder that will keep the serialization

Hm.. there is a huge chance of making a typo or forgetting to update a value when copy \ pasting from another project. Unicorn uses variables \ tokens in its configuration, which makes things much easier: you need to specify the name of the module \ project only once.
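One way to reduce the copy \ paste risk would be to generate the module file from a single module name. Below is a minimal Python sketch of that idea, not an official tool. It assumes the module file has a "namespace" plus an "items" section with "includes" entries that carry a "name" and a "path", as in the examples I copied from; the layer, module name and item path are hypothetical.

```python
import json


def build_module_config(layer, module):
    """Build a module configuration in which the module name is typed only once."""
    namespace = f"{layer}.{module}"
    return {
        "namespace": namespace,
        "items": {
            "includes": [
                {
                    # 'name' becomes the folder that keeps the serialized items
                    "name": "templates",
                    # hypothetical item path, adjust to your own content tree
                    "path": f"/sitecore/templates/{layer}/{module}",
                }
            ]
        },
    }


def module_file_name(layer, module):
    """Follow the suggested <name of module>.module.json naming convention."""
    return f"{layer}.{module}.module.json"


config = build_module_config("Feature", "Navigation")
print(module_file_name("Feature", "Navigation"))
print(json.dumps(config, indent=2))
```

A script like this could run whenever a new project is added, so the namespace, the file name and the item paths always stay in sync.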

I have seen a variables property in file sitecore.json but I am not sure how to use it since I failed to find any documentation regarding variables.

Apart from the problem mentioned above, Sitecore has a pretty rich rules configuration to control what is going to be serialized, what should be skipped, and how to update the content if it is already present in Sitecore. I was not able to find a situation that is covered by Unicorn but not by SCS. You can read more about rules here.

However, I have a few questions that I was not able to find an answer for.

Q1: How to change the serialization behavior based on the environment? For example, I am developing a new feature and need some configuration item to be present only on the development environment but not on UAT and above. Unicorn is based on Sitecore configuration files, and I can use roles to configure or patch the behavior. How do I do a similar thing with SCS?

A1: Well... You probably do not need the same thing here. The difference between Unicorn configuration and SCS configuration is that Unicorn configuration is pushed to Sitecore and used when you do a Sync during deployment, while SCS configuration is always part of the solution and never goes to Sitecore. The delivery of items to Sitecore is different here: your build script produces an item package to be installed during deployment instead of working with serialization files. When installing the package, you can specify which modules need to be installed or skipped. So, the answer to the question is: kind of yes, but instead of using configuration rules you need to have a separate development specific module in the project layer that brings only development specific items. At least I would try this approach now.

Q2: How to change the behavior of the item installation? For example, I want an item to be overridden on non-production environments but created only if it does not exist on the Prod environment. This requirement is valid for some configuration items that need to have different values on different environments.

A2: I did not find such a possibility, except for using different environment specific project modules.

Performing Serialization Operations in CLI

Besides serialization operations like Push and Pull, Sitecore offers pretty useful operations like the ones below:
  • explain - Explains whether a content item path is included and why
  • info - Shows serialization configuration information
  • diff - Compares the content items of two Sitecore instances. Well, this is an interesting one but not sure I will use it often. I would like to have a command that does a dry run pull from Sitecore to file system and show me  potential changes :)
  • etc.
The full list of supported commands can be found here.
Also, I was impressed by the Sitecore CLI messaging. It was pretty easy for me to find the cause of a problem when something was wrong with the configuration or the operation I tried to execute. Error messages were precise and informative. Well done!

Q1: Unicorn can automatically sync changes from Sitecore to the solution. Is it possible to use a similar automated sync with SCS?

A1: Yes, the CLI has a "watch" command that monitors changes to content items in a Sitecore instance and automatically serializes the changes to your file system. However, I noticed that if there is a problem with your serialization files or configuration, the watch mode exits and the sync is no longer performed. This might be a problem for developers, since merge conflicts are a common scenario. If watch mode exits every time you have a conflict, a developer might forget to enable it again, and some changes will not make it to source control. I see this as a problem.

Also, I figured out that Sitecore creates a ".scindex" file in the serialization folder. This file should not be committed to source control and thus needs to be added to .gitignore. Another interesting question regarding this file is performance: the file contains some information about the serialized content, and I wonder whether performance will degrade with hundreds of files or more. We will see...

How to migrate from Unicorn to SCS

A pretty interesting question to me was about moving from Unicorn to SCS. Is it possible?
I compared the yml files and the format looks identical, thus, in theory, it should be possible.
However, Unicorn and SCS use different algorithms for calculating long paths. What will happen to these items then?

I copied part of the serialization from my other project to the sample one and updated the IDs and paths in these files to match the new solution. When I tried to execute the "push" command, I got the following message:

[/sitecore/content/Can be overriden] INCORRECT FILE PATH: ~/_page rendering components,
[/sitecore/content/Can be overriden] found:    ~\Can be overriden\8dcbf151-d484-4c3c-a68e-8f782cf9ae3e\_page rendering components.yml,
[/sitecore/content/Can be overriden] expected: ~\Can be overriden\_page rendering components.yml
[/sitecore/content/Can be overriden] > Fix will move to correct location

 I followed the suggested approach and executed the command below:

dotnet sitecore ser validate --fix

This fixed the problem and moved the file to appropriate location!

Thus, it looks possible to migrate. The only question is how the CLI behaves when hundreds of files need to be moved.

Thumbs up to Sitecore!  

Generating and Installing Item Packages

There are package operation commands in CLI. More about this here

As I understand it, Sitecore suggests generating an item package during the build pipeline and using the package to deliver items instead of doing a Sync from serialization like we did with Unicorn.

However, let's return to the questions I had in the section "Module Configuration". If I need environment specific modules to define environment specific behavior, I need to be able to either generate an environment specific package or specify which modules need to be ignored during package installation.

Right, Sitecore offers this behavior: modules can be included \ excluded in both commands, when generating and when installing a package. Nice!

However... I expected to specify the top root project module during package generation and get all the content it depends on (the module configuration has a references section where you can specify dependencies). Unfortunately, I got only the content that belongs to the module specified in include, regardless of the configuration in the references property :( I hope this is just an issue that is going to be fixed.

Anyway, I was able to generate production package  using exclude parameter and remove all non-production data. Good.

Q1: Where to use exclude \ include parameters? Is it better to do on package generation or during installation?

A1: Well, on the one hand, I do not want to spend time on generating more than one package in the build pipeline. This is not as fast as I would like it to be with a large number of serialization files. At the same time, having different packages for different environments minimizes the risk of getting non-production content on the production environment. So, I would try to avoid having more than one package and keep all environment specific configuration in configuration files instead of content. However, if we do need different content, I would probably choose to have separate packages: one for production content and another that has only the environment specific content and is installed after the production one.
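The split into a production package and an environment specific package can be illustrated with a small Python model of include \ exclude filtering. This is a simplified sketch and not the CLI's implementation: the module names are hypothetical, wildcard matching is an assumption, and references between modules are deliberately not followed, which matches what I observed during package generation.

```python
from fnmatch import fnmatchcase


def select_modules(modules, include=None, exclude=None):
    """Return module names that pass the include filter and are not excluded."""
    selected = []
    for name in modules:
        if include and not any(fnmatchcase(name, pattern) for pattern in include):
            continue
        if exclude and any(fnmatchcase(name, pattern) for pattern in exclude):
            continue
        selected.append(name)
    return selected


# Hypothetical modules; the ".Dev" suffix marks development-only content
modules = ["Foundation.Core", "Feature.Navigation", "Project.Site", "Project.Site.Dev"]

production = select_modules(modules, exclude=["*.Dev"])
dev_only = select_modules(modules, include=["*.Dev"])
print(production)
print(dev_only)
```

A naming convention such as the ".Dev" suffix keeps the filter in the build script stable: new development-only modules are picked up by the pattern without touching the pipeline.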

Q2: How to install item package in deployment pipeline?

A2: Good question... I do not know yet since I did not go so far :) I believe that we need to install CLI on deployment agent. Hope this is an easy task :)

Summary

In this post I described my impressions of the new Sitecore Content Serialization, which is supported on Sitecore 10 and later.

There are some questions and hidden areas that bubbled up during my spike. Probably I will get even more when I start using SCS with the real project. However, I must admit that Sitecore did a great job and new service looks very good. 

I would definitely like to try it on the next project!

Wednesday 13 June 2018

Some template sections or fields are missing after using Update Data Templates commerce command in Sitecore 9.0.1

Recently I have been involved in a project based on Sitecore 9.0.1 + Sitecore SXA + Sitecore Commerce.
It was really a great experience that allowed me to look at some things from a different angle :)

In this post, I would like to share a problem I faced when extending the schema for the Sellable Item in Commerce.
As mentioned above, we needed to extend the default fields shown for a sellable item in the Commerce interfaces. This was not the easiest task, but we managed to do it with the help of the KB article (big thanks to everybody who published it). After we added a new section, we had to regenerate the Commerce templates in the Sitecore backend to see the new section in content items.

Problem
After we updated the Sitecore Commerce templates using the "Update Data Templates" command in the "Commerce" ribbon tab, we figured out that some sections that had been present before were now missing. The one that we noticed was the "Images" section. What was even more interesting, the problem appeared only for a few developers.

Investigation and solution
After some investigation, we found that it is "CatalogTemplateGenerator" that is responsible for template generation. The problem was related to the fact that it generates templates based on the first available object (e.g. Category templates are generated based on the first found Category commerce entity, Sellable Item templates are generated based on the first available sellable item commerce entity, etc.). In our case, it found the same commerce entity for all developers, but some developers had images assigned to that sellable item entity in their commerce database while others did not.
The problem is kind of specific, since it cannot be reproduced for most of the sections and fields. But the images field is a bit special since, on the commerce side, it is implemented as a list control. If it has no value, the information about images is not added to the entity json returned from the commerce server.
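The effect is easy to reproduce in a toy model. The Python sketch below is not the Commerce code: the entity shape is hypothetical and heavily simplified, and the `always_include` parameter is an invented name that only illustrates the idea of the workaround (always creating the field for the images view even when the list is empty).

```python
def template_fields(entity_view, always_include=()):
    """Collect template field names from the child views of one entity view."""
    fields = []
    for child in entity_view.get("ChildViews", []):
        properties = child.get("Properties", [])
        if properties:
            # a view with properties contributes one field per property
            fields.extend(prop["Name"] for prop in properties)
        elif child.get("ChildViews") or child["Name"].lower() in always_include:
            # a list-like view contributes a single field named after the view
            fields.append(child["Name"])
    return fields


def sellable_item(image_children):
    """Hypothetical, heavily simplified entity view of a sellable item."""
    return {
        "Name": "Master",
        "ChildViews": [
            {"Name": "Details", "Properties": [{"Name": "ProductId"}, {"Name": "Brand"}]},
            {"Name": "Images", "Properties": [], "ChildViews": image_children},
        ],
    }


with_images = sellable_item([{"Name": "Image"}])
without_images = sellable_item([])

print(template_fields(with_images))
print(template_fields(without_images))
print(template_fields(without_images, always_include=("images",)))
```

The first and the third call produce the same field list, while the second one silently loses the Images field, which is what happened for the developers whose first sellable item had no images.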

To fix this problem, we had to override the default commerce command with our own and use our own implementation of the template generator. Unfortunately, we were not able to inherit from the standard implementation and override only the parts we needed, since all methods in the "CatalogTemplateGenerator" class are private and non-virtual. Thus, we had to copy-paste the implementation using a decompiler and change the implementation of the "EnsureTemplateFields" method from (taken from dotPeek):
private void EnsureTemplateFields(TemplateItem templateItem, JToken view, string section = "Content")
    {
      EntityView entityView = view.ToObject<EntityView>();
      if (entityView.Properties.Any<ViewProperty>())
      {
        foreach (ViewProperty property in (Collection<ViewProperty>) entityView.Properties)
        {
          TemplateFieldItem templateFieldItem = templateItem.GetField(property.Name) ?? templateItem.AddField(property.Name, section);
          using (new EditContext(templateFieldItem.InnerItem))
          {
            templateFieldItem.Title = property.DisplayName;
            templateFieldItem.InnerItem.Appearance.ReadOnly = true;
            templateFieldItem.InnerItem.Appearance.Hidden = property.IsHidden;
            templateFieldItem.InnerItem[TemplateFieldIDs.Shared] = this.IsFieldLocalizable(view, property.Name) ? "0" : "1";
            templateFieldItem.Type = !(property.OriginalType == typeof (bool).ToString()) ? (property.OriginalType == typeof (double).ToString() || property.OriginalType == typeof (int).ToString() ? "Number" : "Single-Line Text") : "Checkbox";
          }
        }
      }
      else
      {
        if (((IEnumerable<string>) this._blacklist).Contains<string>(entityView.Name) || !entityView.ChildViews.Any<Model>() || view[(object) "ChildViews"].FirstOrDefault<JToken>() == null)
          return;
        TemplateFieldItem templateFieldItem = templateItem.GetField(entityView.Name) ?? templateItem.AddField(entityView.Name, section);
        using (new EditContext(templateFieldItem.InnerItem))
        {
          templateFieldItem.Title = entityView.DisplayName;
          templateFieldItem.InnerItem.Appearance.ReadOnly = true;
          templateFieldItem.InnerItem.Appearance.Hidden = false;
          templateFieldItem.InnerItem[TemplateFieldIDs.Shared] = this.IsFieldLocalizable(view, entityView.Name) ? "0" : "1";
          templateFieldItem.Type = "Treelist";
        }
      }
    }
to something like this:
protected virtual void EnsureTemplateFields(TemplateItem templateItem, JToken view, string section = "Content")
        {
            EntityView entityView = view.ToObject<EntityView>();
            if (entityView.Properties.Any<ViewProperty>())
            {
                foreach (ViewProperty property in entityView.Properties)
                {
                    TemplateFieldItem templateFieldItem =
                        templateItem.GetField(property.Name) ?? templateItem.AddField(property.Name, section);
                    using (new EditContext(templateFieldItem.InnerItem))
                    {
                        templateFieldItem.Title = property.DisplayName;
                        templateFieldItem.InnerItem.Appearance.ReadOnly = true;
                        templateFieldItem.InnerItem.Appearance.Hidden = property.IsHidden;
                        templateFieldItem.InnerItem[TemplateFieldIDs.Shared] =
                            this.IsFieldLocalizable(view, property.Name) ? "0" : "1";
                        templateFieldItem.Type = GetPropertyType(property);
                    }
                }
            }
            else
            {
                if ((this._blacklist.Contains<string>(entityView.Name) ||
                    !entityView.ChildViews.Any<Model>() || view[(object) "ChildViews"].FirstOrDefault<JToken>() == null) && !entityView.Name.Equals("images", StringComparison.OrdinalIgnoreCase))
                    return;
                TemplateFieldItem templateFieldItem =
                    templateItem.GetField(entityView.Name) ?? templateItem.AddField(entityView.Name, section);
                using (new EditContext(templateFieldItem.InnerItem))
                {
                    templateFieldItem.Title = entityView.DisplayName;
                    templateFieldItem.InnerItem.Appearance.ReadOnly = true;
                    templateFieldItem.InnerItem.Appearance.Hidden = false;
                    templateFieldItem.InnerItem[TemplateFieldIDs.Shared] =
                        this.IsFieldLocalizable(view, entityView.Name) ? "0" : "1";
                    templateFieldItem.Type = "Treelist";
                }
            }
        }

Also, since we already had our own generator, we decided to update the logic that resolves the template field type. This logic sits in the same method and looks like the line below:
templateFieldItem.Type = !(property.OriginalType == typeof (bool).ToString()) ? (property.OriginalType == typeof (double).ToString() || property.OriginalType == typeof (int).ToString() ? "Number" : "Single-Line Text") : "Checkbox";

As you can see, only the "Number", "Single-Line Text" and "Checkbox" field types are supported, plus some special logic for lists. We decided to change the implementation a bit and add support for additional types: "html" and "memo". We extracted this into a separate method:
protected virtual string GetPropertyType(ViewProperty property)
{
  string type = "Single-Line Text";
  if (property.OriginalType == typeof(bool).ToString())
  {
    type = "Checkbox";
  }
  else if (property.OriginalType == typeof(double).ToString() ||
           property.OriginalType == typeof(int).ToString())
  {
    type = "Number";
  }
  else if (string.Equals(property.OriginalType, "html", StringComparison.OrdinalIgnoreCase))
  {
    type = "Rich Text";
  }
  else if (string.Equals(property.OriginalType, "memo", StringComparison.OrdinalIgnoreCase))
  {
    type = "Multi-Line Text";
  }

  return type;
}

After these changes, the problem with the missing "images" section was gone and the fields started to look better in Content Editor.


Tuesday 12 June 2018

Changing service user for Sitecore Commerce

Recently, I have been working on the pre-production preparations for a Sitecore 9.0.1 based solution. One of the things to be done was to either disable the default Admin user or change its password to a more secure one.

Once I changed the password for the default Admin user and tried to open the Sitecore Commerce interfaces, they were not working. The browser dev tools showed a number of JavaScript errors in the console related to communication problems. I was even more surprised when I figured out that the updated Admin user could not log in to the Sitecore backend.

After some investigation, I figured out that the Admin user was locked for some reason. Unfortunately, after I unlocked the Admin user, it was locked again and again, which made me think that something was trying to log in with an incorrect password and Sitecore kept locking the user.

The hardcoded Admin user was found in the Commerce Server configuration, in the file below:
/wwwroot/data/Environments/PlugIn.Content.PolicySet-1.0.0.json
To work around the problem, I decided to register a dedicated Commerce admin user and reconfigure the Commerce server to use this user instead of the standard one.

Note: do not forget that all Commerce instances (“Authoring”, “Shops”, “Minions” and “Ops”) need to be updated. To do this, open “PlugIn.Content.PolicySet-1.0.0.json” located in “/wwwroot/data/Environments” in every site mentioned above and update the “UserName” and “Password” properties.

After changing Commerce Server configuration, one needs to bootstrap Sitecore Commerce server again to ensure the configuration is applied. This should be done using Postman requests (use GetToken to get the auth token and then call “Bootstrap Sitecore Commerce”)

Thursday 25 January 2018

Sitecore cache problem when iterating through the content using API

Sitecore uses a number of various caches to keep the most frequently used data in memory instead of making requests to a database every time API needs the data.

The most important thing to remember here is that the cache should keep only the data that is really needed. However, there are a number of cases when things might go wrong and the cache starts keeping data that has been accessed only once. In this case, the next time you need an item that is not stored in the cache, Sitecore will make a database request and cache the item again. Later, the obsolete data will be removed from the cache (in other words, the cache will recover), but this takes some time. During the recovery time, request processing performance will be decreased.
Things might become even worse if the database server is located far from the web servers and latency starts playing its role.

Which operations might cause problems with cache?


I would say any operation that works with a great number of data items that are not supposed to be cached or requested again within a short period of time.

A few examples:

1. Reindexing data. 
In this case, the indexing API iterates through the Sitecore data (tree) and performs an indexing operation. As a result, all indexed items will be added to the Sitecore caches. Just imagine what will happen if you are reindexing the whole content tree with millions of items... The cache will soon overflow and Sitecore will start cleaning it up in the middle of the indexing operation. Thus, instead of just indexing data, processor time is spent on cache operations and further data clean up.

2. Publishing data
Similarly to indexing, the publishing process accesses a lot of data, thus a number of unneeded items are cached.

3. Sitecore initialization logic
Sitecore initialization logic performs a number of operations that also require access to the items. For example loading localization data, scanning some items to load application settings, etc.


What can we do to improve the situation?

The easiest solution that came to my mind is using a sort of cache disabling context that would prevent data from being cached.
Sitecore allows disabling data caches by using Sitecore.Data.DatabaseCacheDisabler:

using(new DatabaseCacheDisabler())
{
  // your code here...
}
All the code executed in the DatabaseCacheDisabler context will neither get data from nor put data into the Sitecore caches. Another good thing is that DatabaseCacheDisabler inherits from the Switcher class, which is thread static. This means that the cache disabling logic applies to the current context only and will not influence other threads.
The bad thing about this code is that it will not get data from the cache even if it is there! In other words, if some of the items your code needs to work with are already in the cache, Sitecore will not use them and will make a database request instead, introducing a performance penalty.

To improve performance, we would rather get data from the cache if it is already there, but not add new data to the Sitecore caches when we read it from the database.

After some investigation, I have managed to find the switcher that does the trick: Sitecore.Data.CacheWriteDisabler.
The usage of this class is quite the same as the previous one but the code in the scope of this disabler will be using cached data if it is available.
Performance tests confirmed that the code in the scope of this disabler works faster than when I was using just DatabaseCacheDisabler.
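The difference between the two disablers can be shown with a toy read-through cache. The Python sketch below is only a model of the behavior described in this post, not Sitecore code: `use_cache=False` stands in for DatabaseCacheDisabler, `write_cache=False` stands in for CacheWriteDisabler, and the class, the parameter names and the data are all hypothetical.

```python
class ItemStore:
    """Toy model of a database with a read-through cache in front of it."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.cache = {}
        self.db_reads = 0

    def get(self, key, use_cache=True, write_cache=True):
        if use_cache and key in self.cache:
            return self.cache[key]
        self.db_reads += 1  # every cache miss costs one database round trip
        value = self.rows[key]
        if use_cache and write_cache:
            self.cache[key] = value
        return value


def scan(store, keys, **mode):
    """Read every key once and report how many database reads it took."""
    before = store.db_reads
    for key in keys:
        store.get(key, **mode)
    return store.db_reads - before


def warm_store():
    """Create a store where 'a' and 'b' are already cached by normal traffic."""
    store = ItemStore({"a": 1, "b": 2, "c": 3, "d": 4})
    scan(store, ["a", "b"])
    return store


keys = ["a", "b", "c", "d"]

no_cache = warm_store()
print(scan(no_cache, keys, use_cache=False), len(no_cache.cache))

read_only = warm_store()
print(scan(read_only, keys, write_cache=False), len(read_only.cache))
```

In both modes the cache keeps only the two items it had before the scan, but the read-only mode needs fewer database reads because the already cached items are reused.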

Finally, I would like to warn about using disablers in Sitecore.
One should remember that if the code inside the disabler not only reads items but also modifies them, the changes will not appear in the Sitecore caches if these items have already been cached.
Thus, one should use disablers wisely, without breaking the cache integrity.



Saturday 6 January 2018

Using PowerShell with Sitecore

Recently, I have been asked whether it is possible to use PowerShell to work with Sitecore services.
I have never tried this before. I knew about existing modules that allow managing Sitecore via PowerShell.
After brief research, I found a few but all of them require a custom package to be installed on the Sitecore instance. This is not what I would like to do without knowing all the details about the package to be installed. It would be interesting to check what we can do with a clean Sitecore instance.

As a starting point, I have installed clean Sitecore 8.2 Update-3 (with the hostname sitecore82u3) and started my experiments.

Requesting an unprotected page using PowerShell

The task seems trivial but is still quite useful.
For example, you might have a page that returns some statistics or performs some actions based on the passed parameters. I decided to request the sample layout.aspx page that is located in the /layouts folder by default.
The PowerShell command looks simple:
Invoke-WebRequest -uri "http://sitecore82u3/layouts/sample layout.aspx"
I got a response with status code 200 and the page content.
Good, but what if I need to request a security-protected page that requires a login first?

Requesting a protected page using PowerShell via the login dialog

When I request a protected page, e.g. /sitecore/admin/cache.aspx, I receive a 200 response code, but from the content I can figure out that my request has been redirected to the login page:
Invoke-WebRequest -uri "http://sitecore82u3/sitecore/admin/cache.aspx"

Response content fragment:
...
<title>
        Sitecore Login
</title>
...

Thus, we need a way to fill in the login credentials before requesting the protected page. The easiest way of doing this is to get the input controls from the page and set some data there:

# define session variable
$session = $null
# url to the login page in admin
$loginUrl = "http://sitecore82u3/sitecore/admin/login.aspx"
$actionResponse = Invoke-WebRequest -uri $loginUrl -SessionVariable session -UseBasicParsing
$fields = @{}
# Collect the input fields that have a Value property
$actionResponse.InputFields.ForEach({
  if ($_.PSObject.Properties.Name -match "Value") {
    $fields[$_.Name] = $_.Value
  }
})
# Set login info. Note: these field names are specific to the admin login page.
# For the standard Sitecore 8.2 Update-3 login page, use $fields.UserName and $fields.Password
$fields.LoginTextBox = "sitecore\my_user"
$fields.PasswordTextBox = "my_password"
# Perform POST request with credentials
Invoke-WebRequest -uri $loginUrl -WebSession $session -Method POST -Body $fields -UseBasicParsing
# Using authenticated session make a request to a protected page
(Invoke-WebRequest -uri "http://sitecore82u3/sitecore/admin/cache.aspx" -WebSession $session).Content
As a result, you will see the cache page content.

This approach works, but we should remember that it depends on the HTML markup of the login page. If the markup changes at some point, the script will stop working, so it would be better to find a more robust approach.

Requesting a protected page using PowerShell via the login endpoint

After some investigation, I noticed that Sitecore includes the Sitecore Client Services component by default. This component makes it easy to create your own services and also provides a login endpoint. This is exactly what I need.
To authenticate the user, we just need to update our script to call the service instead of working with the login page:
# define session variable
$session = $null
# url of the Sitecore Client Services login endpoint
$loginUrl = "https://sitecore82u3/sitecore/api/ssc/auth/login"
$params = @{
  "domain"   = "sitecore"
  "username" = "my_user"
  "password" = "my_password"
}
# Perform POST request with credentials
$actionResponse = Invoke-WebRequest -uri $loginUrl -SessionVariable session -Method POST -Body $params -UseBasicParsing
# Using authenticated session make a request to a protected page
(Invoke-WebRequest -uri "http://sitecore82u3/sitecore/admin/cache.aspx" -WebSession $session).Content
After executing this script, I was able to see the content of the protected cache page.

This approach also looks more secure, since the login endpoint requires a configured SSL connection and will not accept plain HTTP connections.
Another important note about this endpoint: by default, it is configured to allow local connections only and will reject all remote ones.
If you need to connect to a remote Sitecore solution, I can see at least two options:

  1. Use PowerShell to connect to the remote server and proxy the requests, so that the real requests are made by the remote server to its local Sitecore instance.
  2. Configure Sitecore Client Services to process all requests by changing the value of the "Sitecore.Services.SecurityPolicy" setting in the "Sitecore.Services.Client.config" configuration file.

To me, the first option seems more appropriate. It does not open a potential security problem and does not require updating Sitecore configuration.

In addition, it is important to mention that, using the described approach, you can make calls to services based on the Sitecore Client Services component, the WebAPI component or any other services protected by Sitecore security.


PS: while searching for the Sitecore Client Services login endpoint, I noticed that the Sitecore WebAPI component also provides a login endpoint: "authenticate". However, I decided not to describe its usage here. The component has not been updated for a long time and the main development has shifted to the more powerful SSC component.

Friday 3 March 2017

Sitecore Fast query: performance problem

Sitecore Fast Query is designed for retrieving and filtering items from the Sitecore database. Sitecore Fast Query uses the database engine to execute queries.
You can find the original article with detailed information about Sitecore Fast Query here.

Looks good, right?

The feature is cool, especially when you need to find all items with a specific name or field value, which is not possible via the Database API.
In Sitecore 7+ one can use the Content Search feature for this purpose, but sometimes the index is not up to date, or the field data is stored in a different way than you expected, etc.

Getting data using Fast Query can be very useful in some cases and does not introduce additional dependencies.

Let's have a look at the performance, is it good enough?

Sitecore Fast Query converts all the query conditions into SQL statements, and this places certain limitations on the use of queries.
It is more or less easy to generate an SQL statement for a query that only defines conditions on item fields and properties (e.g. find items by some field value, by template ID or name, and other combinations). But what happens when we need to search for items under a specific location?
For example: fast:/sitecore/content/Home//*[@@templatename='Sample Item']

Since the Sitecore database structure does not keep item paths in tables and the Sitecore content tree is built on ParentID references, such a query would require an SQL statement that joins the Items table to itself a number of times.

To improve the performance of searching in the hierarchy and to simplify the SQL statement, Sitecore databases have a special table that keeps ancestor-descendant relations between items. This table is updated on every item save, move and delete operation. This slows down item operations a little bit but makes hierarchy search possible in Fast Query.
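
The idea can be sketched with an in-memory SQLite database. This is only an illustration under my own assumptions: the table and column names below are simplified stand-ins and not the actual Sitecore schema, and the SQL is not what Fast Query really generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Items (ID TEXT PRIMARY KEY, Name TEXT, ParentID TEXT, TemplateID TEXT);
    CREATE TABLE Descendants (Ancestor TEXT, Descendant TEXT);
""")

# Tiny tree: sitecore -> content -> Home -> About, plus sitecore -> templates
items = [
    ("1", "sitecore", None, "folder"),
    ("2", "content", "1", "folder"),
    ("3", "Home", "2", "sample"),
    ("4", "About", "3", "sample"),
    ("5", "templates", "1", "folder"),
]
conn.executemany("INSERT INTO Items VALUES (?, ?, ?, ?)", items)

# Maintain the relation table: one row per (ancestor, item) pair.
parents = {item_id: parent_id for item_id, _, parent_id, _ in items}
for item_id in parents:
    ancestor = parents[item_id]
    while ancestor is not None:
        conn.execute("INSERT INTO Descendants VALUES (?, ?)", (ancestor, item_id))
        ancestor = parents[ancestor]

# "All 'sample' items anywhere under content" becomes a single join,
# no matter how deep the tree is.
rows = conn.execute("""
    SELECT i.Name
    FROM Items i
    JOIN Descendants d ON d.Descendant = i.ID
    WHERE d.Ancestor = ? AND i.TemplateID = ?
    ORDER BY i.Name
""", ("2", "sample")).fetchall()

names = [name for (name,) in rows]
pair_count = conn.execute("SELECT COUNT(*) FROM Descendants").fetchone()[0]
print(names, pair_count)
```

Note that even this five-item tree needs seven relation rows, because every item gets one row per ancestor. The relation table grows faster than the Items table as the tree gets deeper.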

Let's have a look at the queries, corresponding SQL statements and execution time.
Test Data:
800k items in Items table
6M records in Descendants table
3M records in Versioned fields table

Fast Query
fast://*[@@templateid = '{0}']

This query is translated into a simple SQL statement that searches for all items with the specified template.
The execution time of such a request is 0.47 ms.

If we change this query a little bit and try to search items with a specific template in a specific location (under content item), the picture will be a little bit different.

Fast Query
fast:/sitecore/content//*[@@templateid = '{0}']

This request takes ~1500 ms to execute.


I got somewhat worse results when searching by field.

Fast Query
fast://*[@__lock='%\"" + Context.User.Name + "\"%']

This query takes ~820 ms to execute, while the same query under a specified location takes ~3200 ms.

Fast Query
fast:/sitecore/content//*[@__lock='%\"" + Context.User.Name + "\"%']


Summary

Sitecore fast query can be very effective and useful for cases when one needs to select items using conditions that do not involve item hierarchies.

Some could say that Fast Query does not use the cache and always sends requests to SQL Server. Well, yes, this is true, but fast query execution is not a common type of database request, and spending resources on caching the results and invalidating that cache might turn out to be more expensive than just requesting the data from the database. Also, the data constructed from the fast query result is taken from the cache. Thus, it might not be so bad.

As for hierarchical data (queries that use item paths), I would recommend reviewing the query: search for all data that fits your conditions using fast query, and then filter this data by location on the web server side. For huge solutions with a lot of content, this approach will work much better. To test this, I replaced the query
fast:/sitecore/templates/System/Analytics/External/Matchers/Client Matchers/* with just a Database.GetItem(<item_path>).GetChildren() call and got 10 times faster execution.
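
The "filter by location on the web server side" step can be as simple as a path prefix check. Below is a minimal Python sketch of that step only. The candidate paths are made up and stand in for the result of a location-free query, and the is_under helper is a hypothetical name.

```python
def is_under(item_path, root_path):
    """True if item_path is a descendant of root_path (case-insensitive)."""
    root = root_path.rstrip("/").lower() + "/"
    return item_path.lower().startswith(root)


# Stand-in for the result of a location-free query such as
# fast://*[@@templatename='Sample Item']; the paths are made up.
candidates = [
    "/sitecore/content/Home",
    "/sitecore/content/Home/About",
    "/sitecore/content/Homepage/Promo",
    "/sitecore/templates/Sample/Sample Item/__Standard Values",
]

matches = [path for path in candidates if is_under(path, "/sitecore/content/Home")]
print(matches)
```

The trailing slash appended to the root matters: without it, /sitecore/content/Homepage/Promo would be treated as a descendant of /sitecore/content/Home.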

Thanks.


Thursday 9 April 2015

What is Sitecore update package?

As you probably know, Sitecore distributes updates via special packages called Sitecore update packages. These packages have the ".update" extension and can be installed using a special page called UpdateInstallationWizard, which can be found in the /sitecore/admin folder.

What is in the package and how is it different from standard Sitecore package?

The standard Sitecore package is designed for moving content from one Sitecore instance (server) to another. Thus, the only operations that must be supported for this scenario are adding Sitecore items, files and security roles. In other words, it should support Add operations.

The Sitecore update package is designed for upgrading Sitecore instances between minor Sitecore versions. The main difference between upgrading and just copying data is that during an upgrade the most common operation is not Add (install a new item or file) but updating already existing content, with support for analyzing and resolving conflicts.

The main features of the Sitecore update package:

  • The package supports commands such as:
    • Delete items and files;
    • Add items and files;
    • Change a specific set of item fields;
    • Change a file (for ordinary files this is actually implemented as a replace);
    • Patch an Xml file (the feature is disabled by default and is not used right now);
  • Generation of a "real" rollback (uninstall) package during the package installation. The rollback package contains only the revert operations for the actions that have been performed.
  • Support for Analyze (Dry Run) mode, in which you run the package without really installing it. This mode is very useful for analyzing and fixing potential problems with the package.
  • Support for 2 different installation modes: Install (when we guarantee that Sitecore will contain all features from the released Sitecore version) and Update (when we guarantee that, in case of a conflict with customer data, the custom data will not be lost).
  • Support for post-installation instructions that allow you to execute custom code after the package installation is finished.
  • Using the API from the Sitecore.Update assembly, one can generate a snapshot package: a package that contains information about files and items but not their content. Such a package is small and can be used as a source for generating update packages. This is very useful, since you don't need to have a Sitecore instance up and running.

The package format.

Despite the ".update" extension, the package is just an ordinary Zip archive. In this archive you can see a folder structure that keeps installation commands grouped by command type. For example, all add item commands appear in the "addeditems" folder, deleted items in "deleteditems", etc. Every entry in these folders is either a regular folder or an xml file that contains command metadata and instructions.
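
Since the package is a Zip archive, any Zip library can be used to peek inside. Here is a minimal Python sketch that groups archive entries by their top-level folder. It runs against a small stand-in archive built in memory: the entry names are made up, only the "addeditems" and "deleteditems" folder names come from the description above, and a real package may well have a different or deeper layout.

```python
import io
import zipfile
from collections import defaultdict


def group_by_command_folder(archive):
    """Map each top-level folder in the archive to the entries stored under it."""
    groups = defaultdict(list)
    for name in archive.namelist():
        if name.endswith("/"):
            continue  # skip explicit directory entries
        folder, _, rest = name.partition("/")
        groups[folder].append(rest)
    return dict(groups)


# Build a small stand-in archive in memory; the entry names are made up.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("addeditems/master/sitecore/content/home.xml", "<command />")
    archive.writestr("addeditems/master/sitecore/content/about.xml", "<command />")
    archive.writestr("deleteditems/master/sitecore/content/old.xml", "<command />")

with zipfile.ZipFile(buffer) as archive:
    groups = group_by_command_folder(archive)

for folder, entries in sorted(groups.items()):
    print(folder, len(entries))
```

To try it on a real package, open the ".update" file with zipfile.ZipFile(path) instead of the in-memory buffer.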

User scenarios we considered when working on update package framework.

Actually, when designing the framework we wanted to cover one scenario: simplifying the upgrade procedure between minor versions. However, this scenario can currently be achieved using 2 different approaches.

  • Upgrading Sitecore instance.
  • Extracting added and customized content with further installation to the new version of Sitecore (in other words - moving the custom data to a new instance).

The first scenario requires generating an update package that contains differences between Sitecore versions.

Pros:

  • A single package that can work for a number of customers.
  • Does not require access to customer solutions.
  • The package is cumulative which means that using the same package you can upgrade from any version in between source and target.

Cons:

  • It is too generic and does not reflect customization.

The second scenario requires generating an update package that contains only customized data that can be easily moved to a different Sitecore instance.

Pros:

  • In some cases it might be much easier to upgrade.
  • Having a single package with custom data simplifies the upgrade to any Sitecore version.

Cons:

  • Package generation logic has to be moved to the customer side, which means that customers need very strong knowledge of the changes we have made in Sitecore. Otherwise the upgraded solution might be broken.

How to generate an update package?

The whole API for generating packages, working with package data and installing update packages is located in the Sitecore.Update assembly. We just use a set of batch files that call the necessary API.

There are also a number of custom applications that use our API and integrate package-related commands into the Explorer context menu or even into Visual Studio.

Is it possible to control what is added to the package?

Yes, the package generation API supports file filters (to filter scanned items) and custom C# filters (to filter or change commands in an already generated package). All filters can be configured in a configuration file or added from code.

Is there an easy way to explore the update package?

I know of at least several tools that can present the package data in a readable way.

Personally, I either unpack the package files and analyze their structure or use my own tool for analyzing packages.

The Package Analyzer tool features

  • Has a set of filters to filter package commands.
  • Can perform initial package analysis for a number of dangerous commands like changing configuration files, etc.
  • Allows extracting filtered commands to a text file.
  • Can show the command Xml.
  • Can compare 2 update packages and use WinMerge app for showing diff between commands.
