Drupal
  |  25th October, 2016

Programmatically updating URL aliases using Batch API in Drupal 8

PURUSHOTAM RAI

Drupal has always had excellent support for human-friendly URL’s and SEO in general, from early on we have had the luxury of modules like pathauto offering options to update URL for specific entities, token support and bulk update of URLs. Though bulk update works for most cases, there are some situations where we have to update URLs programmatically, some of the use cases are:

  • Multisite setup -- When your setup has many sites, its inconvenient to use bulk update UI to update aliases for each of your sites. Programmatically updating the aliases is a good choice and can be executed via the update hooks.
  • Conditional Update -- When you wish to update aliases based on certain conditions.

 

In Drupal 8 Pathauto services.yml file we can see that there is a service named ‘pathauto.generator’ which is what we would need. The class for corresponding to this service, PathautoGenerator provides updateEntityAlias method which is what we would be using here:

public function updateEntityAlias(EntityInterface $entity, $op, array $options = array())

 

EntityInterface $entity: So, we need to pass the entity (node, taxonomy_term or user) here.
String $op: Operation to be performed on the entity (‘update’, ‘insert’ or ‘bulkupdate’).
$options: An optional array of additional options.

Now all we need is to loop the entities through this function, these entities may be the user, taxonomy_term or node. We will be using entityQuery and entityTypeManager to load entities, similar to the code below

  $entities = [];
  // Load All nodes.
  $result = \Drupal::entityQuery('node')->execute();
  $entity_storage = \Drupal::entityTypeManager()->getStorage('node');
  $entities = array_merge($entities, $entity_storage->loadMultiple($result));

  // Load All taxonomy terms.
  $result = \Drupal::entityQuery('taxonomy_term')->execute();
  $entity_storage = \Drupal::entityTypeManager()->getStorage('taxonomy_term');
  $entities = array_merge($entities, $entity_storage->loadMultiple($result));

  // Load All Users.
  $result = \Drupal::entityQuery('user')->execute();
  $entity_storage = \Drupal::entityTypeManager()->getStorage('user');
  $entities = array_merge($entities, $entity_storage->loadMultiple($result));

  // Update URL aliases.
  foreach ($entities as $entity) {
    \Drupal::service('pathauto.generator')->updateEntityAlias($entity, 'update');
  }

This works fine and could be used on a small site but considering the real world scenarios we generally perform this type of one-time operation in hook_update_N() and for larger sites, we may have memory issues, which we can resolve by involving batch API to run the updates in the batch. 

Using Batch API in hook_update_n()

As the update process in itself a batch process, so we can’t just use batch_set() to execute our batch process for URL alias update, we need to break the task into smaller chunks and use the $sandbox variable which is passed by reference to update function to track the progress.  

The whole process could be broken down into 4 steps:

  • Collect number of nodes / entities we would be updating the URL for.
  • Use the $sandbox to store information needed to track progress.
  • Process nodes in groups of a certain number (say 20, in our case ).
  • And finally setting the finished status based on progress.
function mymodule_update_8100(&$sandbox) {
  $entities = [];
  $entities['node'] = \Drupal::entityQuery('node')->execute();
  $entities['user'] = \Drupal::entityQuery('user')->execute();
  $entities['taxonomy_term'] = \Drupal::entityQuery('taxonomy_term')->execute();
  $result = [];

  foreach ($entities as $type => $entity_list) {
    foreach ($entity_list as $entity_id) {
      $result[] = [
        'entity_type' => $type,
        'id' => $entity_id,
      ];
    }
  }

  // Use the sandbox to store the information needed to track progression.
  if (!isset($sandbox['current']))
  {
    // The count of entities visited so far.
    $sandbox['current'] = 0;
    // Total entities that must be visited.
    $sandbox['max'] = count($result);
    // A place to store messages during the run.
  }

  // Process entities by groups of 20.
  // When a group is processed, the batch update engine determines
  // whether it should continue processing in the same request or provide
  // progress feedback to the user and wait for the next request.
  $limit = 20;
  $result = array_slice($result, $sandbox['current'], $limit);

  foreach ($result as $row) {
    $entity_storage = \Drupal::entityTypeManager()->getStorage($row['entity_type']);
    $entity = $entity_storage->load($row['id']);

    // Update Entity URL alias.
    \Drupal::service('pathauto.generator')->updateEntityAlias($entity, 'update');

    // Update our progress information.
    $sandbox['current']++;
  }

  $sandbox['#finished'] = empty($sandbox['max']) ? 1 : ($sandbox['current'] / $sandbox['max']);

  if ($sandbox['#finished'] >= 1) {
    return t('The batch URL Alias update is finished.');
  }

}

The process of loading the entities will differ in case we are just updating a single entity type say nodes. In that case, we can use loadMultiple to load all the entities at once per single batch operation. That’s a kind of trade-off we have to do according to our requirements. The crucial part is using sandbox variable and splitting the job into chunks for batch processing.

Looking for a Drupal partner ?

We are drupal 8 ready