Migrating Drupal 6 Multigroups to Drupal 7 Field Collections

Submitted by Gergely Lekli on Tue, 02/28/2012 - 5:13pm

Although the CCK3 module never saw a full release, it was still worth upgrading Drupal 6 sites from CCK2 to CCK3, because the latter added a very compelling tool to CCK's arsenal: Multigroups. This feature allowed users to group several fields together and enter multiple values into that set of fields as a whole, repeating the field group whenever a new value is added. For instance, let's suppose your node has an event multigroup containing date and location data. If you were to add a new value to this multigroup, a new set of date and location fields would appear, and you could enter the date and location at once such that they are tied together. This is especially convenient if you need to be able to add a variable number of sets. Otherwise, you would have to create a fixed number of date fields and the same number of location fields, which is far less convenient.

If you are running a Drupal 6 site with Multigroups, here is an important question to consider: What happens to Multigroup fields when you upgrade to Drupal 7? The answer: They convert to regular fields, and you are left with the fields of the multigroup just lying around, each with multiple values in them, seemingly all unrelated. In the example above, we would be left with two individual fields, a date field with a certain number of values, and a location field with the same number of values. At this point, if we wanted to add new date/location pairs, we would need to add new values to the date field, and then add the corresponding values to the location field in the exact same order, so that at least the ordinal position indicates which date corresponds to which location. This is certainly unacceptable on a production site.

The Multigroup module does not have a Drupal 7 version, so continuing to use it is not an option after the upgrade. There is a Drupal 7 module that provides very similar functionality: Field collection. I decided to use this module as a replacement on a recently upgraded site. However, there is no upgrade path for migrating data between two different modules, nor did I find complete documentation on drupal.org about a feasible way to upgrade. If you are upgrading from D6 and need to preserve the data, there are two viable options: leave the Multigroups functionality behind, or manually migrate the data to Field collections. As you can guess from the title of the article, I migrated the data on the site I worked on.

Following is a summary of my experiences migrating Multigroups data from a D6 site to Field collections on D7.

Before I could come up with a migration path, I had to determine how the Field collection module works. A brief investigation revealed that the module defines a new entity, called field_collection_item. You attach this entity to the node, and the fields that you wish to put in the field set need to be attached to the newly defined entity. By contrast, Multigroups provides a type of field group where you can simply group the fields that you wish to act as a set. On the database level in Multigroups, the fields in the set are just regular fields, and the grouping is done by matching values that have the same delta. Field collection, on the other hand, defines a field type that you add to the node and that links the new entities to the node. The figure below illustrates the storage mechanism for both.

For the sake of example, let's suppose we have a node that represents an organization that organizes events. In the node, we have a multigroup called Event, and the multigroup has two fields, Date and Location, so that these data can be recorded for the organization’s events. Now that things have become clear, a reasonable path to migrate data from one to the other seems to be the following.

  • Add a Field collection field to the node. (Called Event in the example)
  • Add the fields to the Field collection field that used to belong to the Multigroup. (Date and Location in the example)
  • Assign as many values to the Field collection field as the fields in the multigroup used to have. The values in this field would be Field collection entities.
  • Take the values from the fields in the multigroup one by one, and spread them across the fields in the Field collection entities.
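The net effect of these steps on the stored rows can be sketched with plain PHP arrays. This is only an illustration: the node ID, the item IDs, and the values are made up, the real tables have more columns than shown, and the value column is abbreviated to 'value'.

```php
<?php
// Simplified rows; only the columns the migration touches are shown.
// Node 10 has two events. All IDs and values are invented for illustration.

// Before: both fields belong to the node and are paired only by delta.
$before = array(
  'field_data_field_date' => array(
    array('entity_type' => 'node', 'entity_id' => 10, 'delta' => 0, 'value' => '2012-03-01'),
    array('entity_type' => 'node', 'entity_id' => 10, 'delta' => 1, 'value' => '2012-04-15'),
  ),
  'field_data_field_location' => array(
    array('entity_type' => 'node', 'entity_id' => 10, 'delta' => 0, 'value' => 'Los Angeles'),
    array('entity_type' => 'node', 'entity_id' => 10, 'delta' => 1, 'value' => 'Pasadena'),
  ),
);

// After: the node only holds references to collection items (IDs 1 and 2),
// and each date/location row belongs to one item, always at delta 0.
$after = array(
  'field_data_field_event_collection' => array(
    array('entity_type' => 'node', 'entity_id' => 10, 'delta' => 0, 'field_event_collection_value' => 1),
    array('entity_type' => 'node', 'entity_id' => 10, 'delta' => 1, 'field_event_collection_value' => 2),
  ),
  'field_data_field_date' => array(
    array('entity_type' => 'field_collection_item', 'entity_id' => 1, 'delta' => 0, 'value' => '2012-03-01'),
    array('entity_type' => 'field_collection_item', 'entity_id' => 2, 'delta' => 0, 'value' => '2012-04-15'),
  ),
  'field_data_field_location' => array(
    array('entity_type' => 'field_collection_item', 'entity_id' => 1, 'delta' => 0, 'value' => 'Los Angeles'),
    array('entity_type' => 'field_collection_item', 'entity_id' => 2, 'delta' => 0, 'value' => 'Pasadena'),
  ),
);
```

Note how the delta that used to pair a date with a location moves onto the collection field, while the inner fields each end up with a single value per item.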

I put together a piece of code that does this migration; I'll give a rundown of the code, followed by the actual steps that I took. The algorithm is not fully automatic, so some field data will need to be manually inserted into it. First, let's start by defining some variables for later use.

  
  $content_type = 'my_content_type';
  $collection_field = 'field_event_collection';
  // List all the fields that are in the multigroup
  $multigroup_fields = array(
    'field_date',
    'field_location',
  );

We need a list of nids of the nodes that have a value in the multigroup. I used the data in the first field's table, which should yield the same result as the data of the other fields in the multigroup, since, theoretically, when we assign values to the multigroup, we assign values to all the fields in it at once.

  // Get all the nodes that have a value in the multigroup.
  $query = db_select('field_data_' . $multigroup_fields[0])
    ->condition('entity_type', 'node')
    ->condition('bundle', $content_type);
  $query->addExpression('DISTINCT entity_id', 'nid');
  $query->addExpression('revision_id', 'vid');
  $nodes_result = $query->execute();
  

The next block iterates through the nodes that were identified in the previous step, and collects data from the fields that used to be in the multigroup. The delta value is used to sort the individual fields' values. That is, data with delta=0 in one field will be grouped together with data with delta=0 in the other field.

 
  foreach ($nodes_result as $node) {
    // Construct the legacy multigroup for the node from the individual fields.
    $multigroup_data = array();
    foreach ($multigroup_fields as $field) {
      $field_result = db_select('field_data_' . $field, 'field')
        ->fields('field')
        ->condition('entity_type', 'node')
        ->condition('entity_id', $node->nid)
        ->execute();

      // Key the rows by delta so values with the same delta end up together.
      foreach ($field_result as $field_item) {
        $multigroup_data[$field_item->delta][$field] = $field_item;
      }

    }
    // The node loop stays open here; it is closed at the end of the next block.
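As a standalone illustration of the delta-based grouping described above, here is a minimal sketch in plain PHP that needs no Drupal. The function names example_group_by_delta() and example_incomplete_deltas() are hypothetical and not part of the migration script; the second one is an optional sanity check that reports deltas where a field has no row.

```php
<?php
/**
 * Groups field rows by delta, mirroring the inner loop of the script.
 *
 * $rows_by_field is keyed by field name; each row is an object with a
 * delta property, like the rows a database query would return.
 */
function example_group_by_delta(array $rows_by_field) {
  $groups = array();
  foreach ($rows_by_field as $field => $rows) {
    foreach ($rows as $row) {
      $groups[$row->delta][$field] = $row;
    }
  }
  ksort($groups);
  return $groups;
}

/**
 * Returns the deltas that lack a row for at least one expected field.
 */
function example_incomplete_deltas(array $groups, array $fields) {
  $incomplete = array();
  foreach ($groups as $delta => $group) {
    if (array_diff($fields, array_keys($group))) {
      $incomplete[] = $delta;
    }
  }
  return $incomplete;
}

// Made-up sample: two dates, but only one location.
$sample = array(
  'field_date' => array(
    (object) array('delta' => 0, 'value' => '2012-03-01'),
    (object) array('delta' => 1, 'value' => '2012-04-15'),
  ),
  'field_location' => array(
    (object) array('delta' => 0, 'value' => 'Los Angeles'),
  ),
);
$groups = example_group_by_delta($sample);
$incomplete = example_incomplete_deltas($groups, array('field_date', 'field_location'));
```

In this sample, delta 1 has a date but no location, so it is the only delta reported. With the migration script, such a group would still become a collection item; it would simply have no location value.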

The following block does the actual migration. It steps through the multigroups, and creates a Field collection field entry for all of them. Creating a field entry involves adding a row to the field's table, named field_data_{field_name}, and adding an entry to the field_collection_item table, which is used by Field collection module to keep track of its entities. And finally, the last foreach block updates the tables of the fields in the multigroup so that the fields will be assigned to the corresponding Field collection entity as opposed to the node.

   
    // Step through the reconstructed multigroups, which will be collections from now.
    foreach ($multigroup_data as $delta => $data) {
      // Create entry in field_collection_item table.
      $id = db_insert('field_collection_item')
        ->fields(array('field_name' => $collection_field))
        ->execute();

      // Attach collection field data to the node.
      db_insert('field_data_' . $collection_field)
        ->fields(array(
          'entity_type' => 'node',
          'bundle' => $content_type,
          'entity_id' => $node->nid,
          'revision_id' => $node->vid,
          'language' => 'und',
          'delta' => $delta,
          $collection_field . '_value' => $id,
        ))
        ->execute();

      // Go through all the fields in the multigroup.
      foreach ($data as $multigroup_field => $field_data) {
        // Reassign the fields in the multigroup from the node to the collection field instance.
        db_update('field_data_' . $multigroup_field)
          ->fields(array(
            'entity_type' => 'field_collection_item',
            'bundle' => $collection_field,
            'entity_id' => $id,
            'revision_id' => $id,
            'delta' => 0,
          ))
          ->condition('entity_type', 'node')
          ->condition('entity_id', $node->nid)
          ->condition('delta', $delta)
          ->execute();

      }
    }
  }

This migration script is not fully automatic, so some manual preparation is needed. Below is a list of the steps that I took. My steps assume that the site has been upgraded to D7, and CCK fields have been migrated using the Content migrate module.

  1. In Structure/Content types, go to the Manage Fields section of your content type. Add a new Field collection field. For simplicity's sake, I used the same name as the Multigroup used to have.

  2. Under Structure/Field collections, click on Manage Fields near the field you just created.

  3. Add the fields that your multigroup used to have.

  4. Update the variable definition in the first section of the code to reflect the names of your content type/fields.

  5. Run the code. You will need to create a custom module or insert it into an existing one. Defining the function as a submit handler of a simple confirmation form should be fine, but if your database contains a huge amount of data, you might want to consider using Batch API.

  6. Validate that the fields have migrated. Content in the individual fields should have vanished, and the new collection field should have been populated with data. A simple method to compare the before and after state is opening a browser tab with a node edit form before running the migration script, and opening another tab with the same URL after the script has run. That way, you have a snapshot of that particular node before the process.

  7. Remove the individual fields that used to be in the multigroup from the content type.

  8. The fields in the multigroup used to store multiple values: each field stored its values for every repetition of the field set. After migration, one value is stored per field. Because of that difference, you now probably have a multi-value field set with multi-value fields. We don't need that anymore, so the fields in the field set can be set to allow only 1 value. The number of repetitions you wish to allow for the field set can be set in the field collection field's allowed values setting. Be sure not to set the allowed values to 1 in the field settings when you add the field to the Field collection entity, because that deletes all the existing values in the field's instances.
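For step 5 on a large site, the Batch API route could look roughly like the sketch below. All mymodule_* names are hypothetical, and mymodule_migrate_node() stands for a helper you would write yourself from the body of the script's node loop; only batch_set(), drupal_set_message() and t() are Drupal 7 functions.

```php
<?php
/**
 * Builds a Batch API definition that migrates nodes in chunks.
 *
 * $vids_by_nid maps node IDs to revision IDs, as returned by the first
 * query of the script.
 */
function mymodule_migrate_build_batch(array $vids_by_nid, $chunk_size = 50) {
  $operations = array();
  // Preserve keys so each chunk is still a nid => vid map.
  foreach (array_chunk($vids_by_nid, $chunk_size, TRUE) as $chunk) {
    $operations[] = array('mymodule_migrate_batch_op', array($chunk));
  }
  return array(
    'title' => 'Migrating multigroups to field collections',
    'operations' => $operations,
    'finished' => 'mymodule_migrate_batch_finished',
  );
}

/**
 * Batch operation: migrates one chunk of nodes.
 */
function mymodule_migrate_batch_op(array $chunk, &$context) {
  foreach ($chunk as $nid => $vid) {
    // Hypothetical helper holding the per-node part of the script.
    mymodule_migrate_node($nid, $vid);
    $context['results'][] = $nid;
  }
}

/**
 * Batch finished callback: reports how many nodes were processed.
 */
function mymodule_migrate_batch_finished($success, $results, $operations) {
  drupal_set_message(t('Migrated @count nodes.', array('@count' => count($results))));
}

// In the submit handler of the confirmation form, one would then call:
// batch_set(mymodule_migrate_build_batch($vids_by_nid));
```

Splitting the work into chunks keeps each request short, which is the main reason to prefer this over a single long-running page request.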

Following these steps, I was able to migrate the multigroups to field collections correctly. I did experience a minor issue, which seemed strange to me. After running the script I still saw the old values when I refreshed the node edit page that I had loaded before running the code, and I needed to clear cache to see the migrated values. I did not expect a node edit form to be cached, but other than that, the site performed well afterward.
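A likely explanation for the stale values is that Drupal 7 caches loaded field data in the cache_field bin, and a script that writes to the field tables directly does not invalidate that cache. A minimal sketch of clearing it at the end of the migration follows; the wrapper name mymodule_clear_field_cache() is hypothetical, while cache_clear_all() is the Drupal 7 function.

```php
<?php
/**
 * Drops all cached field data so entities are rebuilt from the tables.
 */
function mymodule_clear_field_cache() {
  // '*' with the wildcard flag set removes every entry in the bin.
  cache_clear_all('*', 'cache_field', TRUE);
}
```

Calling this once after the last node has been processed should have the same effect on field values as clearing all caches from the Performance page, without flushing unrelated caches.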

The full code can be downloaded here. I welcome any feedback regarding the code, or your experience doing a similar migration.

24 comments

I used this module in D7 today on a project where I need to add image captions, image credits and images as one set of associated data. This module allows you to (as you explain so well in your post):

1) Set up the initial Field Collection for the set

2) Add the caption, credit and image upload fields to the Field Collection

3) Set the field collection to take unlimited values so you can continue to upload multiple images to a node but keep all the data associated.

It's also really nice to be able to add either existing content type fields or add brand new ones directly to the Field collection field via the field collection UI. I love the fact that it's a separate interface from the main content type UI. But the UIs are similar (almost exact) in how they look and work. But still nice that they are separated.

Amazing module and really opens up a lot of possibilities - now I need to confirm that all of this will work with Views! I'm hoping it does.

Great write-up & description of the Field Collection module and model and also great to see the migration path from CCK multigroup in Drupal 6.

Thanks for sharing this.

Best-

Trevor James

by Matt (not verified) on Fri, 07/27/2012 - 6:15am

Thanks for this writeup. I just did an upgrade from 6 to 7 and had to migrate all the data manually as some of the fields were changed and new fields were added. This was really helpful.

I have very (and I mean *very*) limited knowledge of PHP and am stuck at steps 4 and 5. Do you have any direction as to how we should create our own modules? (A page, a tutorial, a supplement to this.) Right now in the CCK issue queue this blog post is the only solution for the CCK multigroup upgrade problem.

Any insights would be greatly appreciated! :)

-Ryan

by Gergely Lekli on Thu, 08/02/2012 - 9:52am

Hi Ryan,

In essence the attached code is custom tailored to my specific situation - that is, the field names and the content type that I used are hard coded in it. If you would like to use it on your site, you will need to change these. This is what step 4 is about. All you need to do is grab the code, edit it and in lines 3-8 replace
- 'my_content_type' with the name of your content type
- 'field_event_collection' with the name of the field collection you created
- 'field_date' and 'field_location' with the names of fields that you have in the field collection. If you have more than two, add them all here.

Here are some resources for creating a module in D7:
http://drupal.org/node/361112
http://www.packtpub.com/article/creating-your-first-module-drupal-7-modu...

The simplest solution would be this:
- create a module directory with a .info file in accordance with the guides above
- create a .module file in the module directory
- copy the attached code into the .module file, and rename the function so that its name starts with the name of your module for consistency
- define a menu item in the .module file using hook_menu with its 'page callback' pointing to the function you copied over

That way, you can run the conversion code by pointing your browser to the path that your menu item defines. (Be careful if you go this way, though; it will be very easy to run the code accidentally as visiting the url you define in hook_menu runs the code right away.)
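A minimal sketch of such a .module file is shown below, assuming a module named mymodule and the original, argument-free version of migrate_multigroup_to_collections(). The path, the title, and the mymodule_* names are placeholders of my choosing.

```php
<?php
/**
 * Implements hook_menu().
 */
function mymodule_menu() {
  $items['admin/config/development/migrate-multigroup'] = array(
    'title' => 'Migrate multigroups',
    'page callback' => 'mymodule_migrate_page',
    // Without access arguments the page is denied to everyone but user 1.
    'access arguments' => array('access administration pages'),
  );
  return $items;
}

/**
 * Page callback: runs the migration and returns a short message.
 */
function mymodule_migrate_page() {
  migrate_multigroup_to_collections();
  return 'Multigroup migration finished.';
}
```

Remember to clear the cache after enabling the module or changing the menu definition, since Drupal only picks up hook_menu changes when the menu is rebuilt.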

by Ryan (not verified) on Wed, 08/08/2012 - 9:40am

First of all, thank you very much for this blog and your elaborate detail and actually providing resources for helping me and others like me out!

Though one quick question, when one runs the module, is it normal to get a 403 error?

I don’t want to bother you with these questions so I promise this is the last one!

by Gergely Lekli on Wed, 08/08/2012 - 10:02am

This depends entirely on how you have set up your module. Without knowing how you are trying to implement it, my best guess would be that either the 'access arguments' parameter is missing in your menu definition, or you have an error in your 'access callback' function. Try adding this to your menu definition array:

'access arguments' => array('access administration pages'),

Then clear the cache, and try to access the page as admin.

by Ryan (not verified) on Thu, 08/09/2012 - 6:55am

Thanks so much for helping me with getting rid of that 403 error! Like I said I don’t have a lot of experience with PHP or the Drupal API in general so this is very difficult for me.

I am still having trouble with the script but I did discover a mistake that I was making when I was setting up the field collection. I kept trying to set up new fields in the Field Collection and found it strange that it wouldn’t let me create new fields with the same names as the original fields in the node… After reading through the process again and examining your description of the script's methodology for populating the Field Collections I realized the mistake that I was making. **When adding Fields to the Field Collection, use the Existing Field dropdown instead of creating brand new fields.** I kept searching for the variable in the script that defines the names of the destination fields, which didn’t exist for a reason… It was a total “what the heck was I thinking?” moment.

So I’ve attempted to run the script on a migration of 3 fields, with no success. The problem is not that the custom module fails to run from the URL, because I put an echo just before the last bracket of the function to see whether Drupal was executing it. (I’ve tried it with and without the echo.) Now when I run it, it goes to an un-themed empty page, which shows the output of the echo if I included it.

Was this script intended to run with 3 or more variables in the definitions of the initial array? Is it possible that a Drupal 7 or module update broke the script?

Ideally if I get this working with your permission and attribution I would like to contribute this path to the CCK documentation.

Please don't feel like you have to reply... I know I promised no more questions in the last comment I left. :) Though I do sincerely appreciate your time.

by Gergely Lekli on Thu, 08/09/2012 - 9:44am

Yes, you should list all the fields you have in the $multigroup_fields array, however many it is. Although it is possible, I find it unlikely that a Drupal update affected this script, because it does not rely on too many parts of Drupal API.
There are quite a few things that could be going wrong, but I can't think of anything that could be specific to your situation, based on what I know.

by Ryan (not verified) on Fri, 08/10/2012 - 10:47am

Hello Gergely;

In theory do you think that this script could be run by the Devel Execute PHP Code page? (Just a suggestion because it seems like it would be easier to execute it from Devel than create a whole new module.)

-Ryan

by Gergely Lekli on Fri, 08/10/2012 - 1:05pm

Yes, it should work fine. My code is wrapped in a function definition, so make sure you actually call it by doing

migrate_multigroup_to_collections();

by Ryan (not verified) on Tue, 08/14/2012 - 7:20am

Hi Gergely;

First of all I would like to thank you for all of your time and effort going into my comments I greatly appreciate it.

Sadly I have spent more time on this than it would have been for me to manually convert all of my fields over by hand so I am just going to do it all by hand.

I did probe the script a bit by adding echoes and counts into the script (after it didn't work for me the first time) and it looks like the script is not selecting any rows for me in the first part of the function after the variable definitions.

Anyways, thanks for everything you've done here I am sure someone will someday use this to produce a more concrete upgrade path.

Thanks;

-Ryan

by Gergely Lekli on Tue, 08/14/2012 - 9:08am

I am sorry to hear that this method did not work out for you. Good luck with the site migration.

by Gregg Marshall (not verified) on Sun, 10/07/2012 - 10:10am

I think you may have been bitten by the same thing I was, a problem with Content Migrate in CCK. Take a look at my other comment for my full description.

by Gregg Marshall (not verified) on Sun, 10/07/2012 - 9:00am

I ran into a known issue with Content Migrate that affected using this method.

Content Migrate as downloaded fails to put anything into the converted field's bundle database field, as indicated in the issue http://drupal.org/node/1649306. As a result this code's first select will fail to return any records.

If you apply the patch in that issue (I had to apply it manually because the line numbers have shifted slightly, but it is only adding one line of code), then follow this procedure, and finally clear the caches, the upgrade path works exactly as planned.

Thanks for a great article and saving me a bunch of time figuring out how to upgrade my multigroups.

by Ryan (not verified) on Wed, 10/10/2012 - 10:53am

That worked like a charm! Thank you so much for replying and pointing this out! Another issue I had while doing this was to make sure to set the field collection settings to Unlimited entries AND when adding existing fields to the field collection making sure to NOT click on save when Drupal shows you the field settings page.

Thanks again!

-Ryan

by Ryan (not verified) on Wed, 01/09/2013 - 12:09pm

Hello;

I just attempted to use this method again, however I kept getting errors. I discovered that this script is not compatible with field_collection 7.x-1.0-beta5. It only works with field_collection 7.x-1.0-beta4 or lower. This is because of the ability, added in beta5, for field collections to have different revisions.

Just an FYI for anyone out there who is getting an unspecified error that, when one looks at the log, turns out to be an SQL error because there is no default value provided for the revision_id.

-Ryan

by Anita (not verified) on Thu, 02/14/2013 - 10:41am

I have updated the multigroup to field collections script to handle field collection's support of revisions (beta5). Here it is:

UPDATE: Code removed.

by Gergely Lekli on Thu, 02/14/2013 - 11:16am

Thanks for this update.

For easier readability, I removed the code from the comment, and rolled it into a file:
http://blog.urbaninsight.com/files/migrate_multigroup_update_by_Anita_02...

I have not tested it, but it looks promising.

by Shaun (not verified) on Fri, 04/19/2013 - 10:04am

(Thanks for everyone's work on this!)

I'm getting a database error when running the updated script: PDOException: SQLSTATE[42S02]: Base table or view not found: 1146 Table 'mix_dev.field_data_' doesn't exist: SELECT DISTINCT entity_id AS nid, revision_id AS vid FROM {field_data_} field_data_ WHERE (entity_type = :db_condition_placeholder_0) AND (bundle IS NULL ) ; Array ( [:db_condition_placeholder_0] => node ) in migrate_multigroup_to_collections()

by Gergely Lekli on Fri, 04/19/2013 - 10:16am

This appears to be caused by incorrect arguments being passed to the function migrate_multigroup_to_collections().
Double check that the variables that you pass in as arguments ($content_type, $collection_field, and $multigroup_fields) contain correct data that describe your multigroups, as illustrated in the first line of the attachment in the previous comment.
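One way to catch this early is to check the arguments before any query is built. The helper below is a hypothetical sketch in plain PHP, not part of the attached script; it only verifies that the three arguments are present and non-empty.

```php
<?php
/**
 * Returns a list of problems with the migration arguments (empty if none).
 */
function example_check_migration_args($content_type, $collection_field, $multigroup_fields) {
  $problems = array();
  if (!is_string($content_type) || $content_type === '') {
    $problems[] = 'The content type is missing.';
  }
  if (!is_string($collection_field) || $collection_field === '') {
    $problems[] = 'The collection field name is missing.';
  }
  if (!is_array($multigroup_fields) || !$multigroup_fields) {
    $problems[] = 'The list of multigroup fields is empty.';
  }
  else {
    foreach ($multigroup_fields as $key => $field) {
      if (!is_string($field) || $field === '') {
        // An empty name would produce the table name "field_data_".
        $problems[] = 'Multigroup field at position ' . $key . ' has no name.';
      }
    }
  }
  return $problems;
}
```

If the returned array is not empty, the migration should not be started; an empty field name, for example, leads to the table name field_data_ seen in the error message.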

by Shaun (not verified) on Fri, 04/19/2013 - 10:20am

Hi Gergely - Thanks for the response. My assumption was also that the variables were incorrect but they are actually correct.

For instance in the multigroup_fields array I list 'field_ingredient' and sure enough field_data_field_ingredient is in the database as expected.

by James (not verified) on Tue, 04/30/2013 - 9:34am

I got this error when I forgot to build the new (destination) field collection first.

by James (not verified) on Tue, 04/30/2013 - 1:13pm

First: Awesome code, you saved me 2 or 3 days of brute force migration!

The gotcha: Remember to rename the functions if you end up creating several of these little modules. There are two places to change in the second version of the code, lines 1 and 3.

Thanks again!

by Corey (not verified) on Wed, 05/08/2013 - 7:41pm

If anyone is interested in an easy way to leverage batch and queue functionality for migrating lots of multigroup field data, consider using a views bulk operation.

http://drupal.org/project/views_bulk_operations

http://pastebin.com/015iZidB

The snippet has some dummy content type and field names that you can replace with your actual names.

I made a simple view that filters to only nodes of content types which I know have multigroup fields attached. I then adapted the code from the examples to work for one entity as returned by the view, instead of looping over the array of nodes returned by the 'Get all the nodes that have value in the multigroup' query. In the snippet I hardcoded a switch statement which switches on the current entity's content type and provides the correct multigroup field and destination field collection names as variables for the rest of the code to work with.

You can then configure the VBO view to use the queue api by clicking 'Enqueue the operation instead of executing it directly' in the field settings for the 'Bulk operations: Content' field on your view. Navigate to your VBO page, paste in the php code (remove the php tags from the snippet first) and patiently wait for 10,000+ nodes to migrate while you work on something else and avoid max execution time exceeded errors.

:)