How to easily migrate pages from Drupal 6 CCK content types to Drupal 7 fields using the Feeds module

One of the easiest ways to upgrade between version 6 and version 7 of Drupal is to re-build your site in Drupal 7, and then use the Views Data Export and Feeds XPath Parser modules to move your pages and articles into your new site.

This post shows you the details of setting up both a D6 View you can export and a D7 Feeds importer you can use to migrate that View content.

Update: You might find the code attached to the "Feeds is my Friend" post helpful in supplementing these instructions. And I now recommend using the Features module to make it easy to copy Content Types, Taxonomies, Menus, and Feeds Importers from your test to production sites.

Before you begin

  1. Read other good posts on general updating strategies:
  2. Do a COMPLETE backup of your site (files, database, everything).
  3. Create a SEPARATE Test site to use for this process, so you don't kill your Live, Production site.

Set up your old Drupal 6 site

  1. On your old D6 site, install Views, Views Data Export, CTools & any modules they depend on.
  2. I'm using "Views Data Export", instead of the RSS display that is native to Views, because RSS made it difficult to output the path and Date fields correctly.
  3. Finally, create a separate View for each content type you want to move.

View settings in Drupal 6

settings for default display

  1. Add a view type of "Node"
  2. Set filters for only Published Pages.
  3. Add fields for each bit of data you want to migrate, such as these common fields:
    • Uid
    • Nid
    • Post date
    • Path
    • Title
    • Body
  4. Add a "Data Export" display to the View.
    settings for data export display
    • Under "Style settings", choose "XML file". For "Data export: Style options", UNCHECK "Provide as file" and DO CHECK "Transform spaces". For "Transform spaces", choose "Dashes", because XML element names cannot contain spaces.
      settings for XML file settings
    • Under "Data export settings", set a "Path" to this new feed. For example: feeds/pages/all.
    • Under "Fields", choose "Node: Path" and, under "Rewriting", check "Rewrite the output of this field". Enter [path] into the text box, so that you will only get the internal path and not an entire URL.
      settings for Path field rewriting
    • Again under "Fields", choose "Node: Post date" and, under "Rewriting", check "Rewrite the output of this field". Enter [created] into the text box, so you can change the date formatting. For "Date format", choose "Custom" and enter Y-m-d H:i:s O into the text box. This will output your date in a format that is easier to import into the new Drupal 7 site.
      settings for Date field custom format
    • Don't link "Title" or "User" fields to their nodes, or they will output link tags in the feed.
      unchecked link this field to user checkbox
  5. SAVE your View, or all your changes will be LOST!
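
To see why the "Transform spaces" option matters, here is a minimal Python sketch. It is not part of Views Data Export; `label_to_tag` and `is_well_formed` are hypothetical helpers that only imitate the "Dashes" setting and check whether the resulting element parses as XML.

```python
import xml.etree.ElementTree as ET

def label_to_tag(label):
    """Hypothetical helper: imitate 'Transform spaces: Dashes' on a field label."""
    return label.strip().replace(" ", "-")

def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# A field label with a space produces an element that a parser rejects.
raw = "<Post date>2011-04-28</Post date>"

# The dashed form of the same label is a valid element name.
tag = label_to_tag("Post date")
fixed = "<{0}>2011-04-28</{0}>".format(tag)

print(tag)
print(is_well_formed(raw), is_well_formed(fixed))
```

The first element fails to parse because the parser reads "date" as a malformed attribute of an element named "Post".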

XML feed output from Drupal 6

When you click on the link to your new Feed (http://mylivesite.gatech.edu/feeds/pages/all), you may see something like this XML code:

<?xml version="1.0" encoding="utf-8" ?>
<nodes>
    <node>
        <Uid>2</Uid>
        <Nid>71</Nid>
        <Path>/about/staff</Path>
        <Post-date>2011-04-28 13:04:39 -0400</Post-date>
        <Title>Our Staff</Title>
        <Body><p>Our employees are brilliant!&nbsp; And attractive, too.</p></Body>
    </node>
    <node>
        <Uid>2</Uid>
        <Nid>81</Nid>
        <Path>/about/location</Path>
        <Post-date>2011-04-28 13:06:27 -0400</Post-date>
        <Title>Our Location and Hours</Title>
        <Body><p>More fascinating HTML goes here, including an <a href="http://mysite.gatech.edu/fakedirectory/pagename">absolute link</a> whose URL may need replaced if I am changing my site's Domain Name.</p></Body>
    </node>
</nodes>
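
Before importing, it can help to check that every node in the feed carries the fields you expect and that the dates match the custom format. The following Python sketch is one way to do that. It assumes the feed has been saved as text, and that HTML inside Body is entity-escaped (an assumption about the export, made so the sample parses). `check_feed` and `REQUIRED` are hypothetical names.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Element names as they appear in the exported feed.
REQUIRED = ("Uid", "Nid", "Path", "Post-date", "Title", "Body")

# Python equivalent of the PHP date format "Y-m-d H:i:s O".
DATE_FORMAT = "%Y-%m-%d %H:%M:%S %z"

SAMPLE = """<nodes>
    <node>
        <Uid>2</Uid>
        <Nid>71</Nid>
        <Path>/about/staff</Path>
        <Post-date>2011-04-28 13:04:39 -0400</Post-date>
        <Title>Our Staff</Title>
        <Body>&lt;p&gt;Our employees are brilliant!&lt;/p&gt;</Body>
    </node>
</nodes>"""

def check_feed(xml_text):
    """Return a list of problems found in the feed (empty if none)."""
    problems = []
    root = ET.fromstring(xml_text)
    for node in root.iter("node"):
        nid = node.findtext("Nid", default="?")
        # Every expected field should be present in every node.
        for tag in REQUIRED:
            if node.find(tag) is None:
                problems.append("node {}: missing <{}>".format(nid, tag))
        # The date should parse with the custom format chosen in the View.
        date_text = node.findtext("Post-date")
        if date_text is not None:
            try:
                datetime.strptime(date_text, DATE_FORMAT)
            except ValueError:
                problems.append("node {}: bad date {!r}".format(nid, date_text))
    return problems

print(check_feed(SAMPLE))
```

An empty list means the sample passed both checks.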

 

Set up your test site in Drupal 7

  1. Set up your new D7 site with whatever themes, modules & configurations you would like.
  2. Create custom content types with the same fields as you used in your Drupal 6 site. The Features module will speed up your CCK re-creation, because it allows you to import a generic content type with a pre-set collection of fields and settings you often use.
  3. Warning: pay attention to Text Input Formats, or you might strip out important HTML tags from your Body field when importing. For now, give all users permission to use the "Full HTML" text format. Likewise, set the default text format for your new Page content type to "Full HTML".
  4. Install Feeds, Feeds XPath Parser, CTools & any modules they depend on.
  5. Warning: the Pathauto module, if enabled, will overwrite/re-create path aliases for all the pages you import, so you might want to disable Pathauto before importing.

Feed importer settings in Drupal 7

  1. Add a feed importer at http://mysite.gatech.edu/admin/structure/feeds.
  2. For "Basic settings", choose:
    basic settings screen
    • Attach to content type: "Use standalone form".
    • Periodic import: "Off"
    • CHECK: Import on submission
  3. For "Fetcher", use "HTTP Fetcher" and choose:
    HTTP fetcher settings screen
    • CHECK: Auto detect feeds
  4. For "Parser", choose "XPath XML parser".
  5. For "Processor", choose "Node processor" and then use these Settings:
    node processor settings screen
    • Update existing nodes: Replace existing nodes
    • Text format: Full HTML
    • Content type: Page
    • Author: YourUserName (Note: To import page authors, you have to import your users BEFORE importing pages).
    • Expire nodes: Never
  6. For "Node processor Mapping", add "XPath Expression" for each of these fields:
    node processor mapping screen
    • "Node ID" and make it Unique
    • "User ID"
    • "Title"
    • "Body"
    • "Published date"
    • "Path alias"
  7. Under "XPath XML parser", type in your XPath queries like this:
    XPath XML parser screen
    • Context: //node
    • nid: Nid
    • uid: Uid
    • title: Title
    • body: Body
    • created: Post-date
    • path_alias: Path
    • At the bottom of the page, do NOT check any boxes under "Select the queries you would like to return raw XML or HTML", as this will wrap your field data in an extra <Body> tag.
  8. Be sure to Save your settings.
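
The relationship between the context query and the per-field queries can be sketched in Python. This is an illustration of the idea, not the Feeds XPath Parser implementation: the context selects each item, and every field query is evaluated relative to that item. Python's ElementTree does not accept an absolute `//node` on an element, so the sketch uses `.//node`. `parse_items` is a hypothetical name, and the sample Body is entity-escaped as an assumption.

```python
import xml.etree.ElementTree as ET

# Context query: one match per item to import.
CONTEXT = ".//node"

# Mapping target -> query, evaluated relative to each context match.
QUERIES = {
    "nid": "Nid",
    "uid": "Uid",
    "title": "Title",
    "body": "Body",
    "created": "Post-date",
    "path_alias": "Path",
}

FEED = """<nodes>
    <node>
        <Uid>2</Uid>
        <Nid>71</Nid>
        <Path>/about/staff</Path>
        <Post-date>2011-04-28 13:04:39 -0400</Post-date>
        <Title>Our Staff</Title>
        <Body>&lt;p&gt;Our employees are brilliant!&lt;/p&gt;</Body>
    </node>
    <node>
        <Uid>2</Uid>
        <Nid>81</Nid>
        <Path>/about/location</Path>
        <Post-date>2011-04-28 13:06:27 -0400</Post-date>
        <Title>Our Location and Hours</Title>
        <Body>&lt;p&gt;More fascinating HTML goes here.&lt;/p&gt;</Body>
    </node>
</nodes>"""

def parse_items(xml_text):
    """Return one dict per context match, keyed by mapping target."""
    root = ET.fromstring(xml_text)
    return [
        {target: item.findtext(query) for target, query in QUERIES.items()}
        for item in root.findall(CONTEXT)
    ]

for item in parse_items(FEED):
    print(item["nid"], item["path_alias"], item["title"])
```

If a field comes back empty in a check like this, the query name probably does not match the element name in the feed, for example `Post date` instead of `Post-date`.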

Using your feed importer with Drupal 7

import screen

  1. Go to the /import page on your site (for example: http://mysite.gatech.edu/import) & choose the importer you just created (D6 XML pages).
  2. In the Import > URL text box, enter the web address of the feed view you created earlier, for example: http://mylivesite.gatech.edu/feeds/pages/all and click on "Import".
    import screen 2
  3. Hopefully, you'll see a successful Status message that says something like "10 imported items total".
    import screen success status

Quality assurance

  1. Do some sample checking of the pages you imported. Make sure your new pages are identical to those on the old site.
  2. Consider using Views Bulk Operations (VBO) as a great way to add tags or do mass corrections to this imported content in your new Drupal 7 site.
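
One correction worth checking for is absolute links that still point at the old domain. The Python sketch below works on a body's HTML text and is independent of VBO. The new host name `newsite.example.edu` is a placeholder, and `links_to_host` and `rewrite_host` are hypothetical helpers. The regular expression only handles double-quoted `href` attributes, so treat it as a rough check rather than a full HTML parser.

```python
import re
from urllib.parse import urlparse

# Matches double-quoted href attributes only (a deliberate simplification).
HREF_RE = re.compile(r'href="([^"]+)"')

def links_to_host(html, host):
    """Return the href values in html whose host name equals host."""
    return [url for url in HREF_RE.findall(html)
            if urlparse(url).hostname == host]

def rewrite_host(html, old_host, new_host):
    """Point links at old_host to new_host, leaving other links alone."""
    def swap(match):
        parts = urlparse(match.group(1))
        if parts.hostname != old_host:
            return match.group(0)
        return 'href="{}"'.format(parts._replace(netloc=new_host).geturl())
    return HREF_RE.sub(swap, html)

body = ('<p>An <a href="http://mysite.gatech.edu/fakedirectory/pagename">'
        'absolute link</a> and a <a href="/about/staff">relative link</a>.</p>')

print(links_to_host(body, "mysite.gatech.edu"))
print(rewrite_host(body, "mysite.gatech.edu", "newsite.example.edu"))
```

Relative links such as /about/staff have no host name, so they are left untouched.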

Known import Problems

For more information

Comments

Adelle Frank

Going to update post with your feedback

Thanks, M, for giving such detailed feedback: I will update my post with your excellent additions!

Adelle

Date field Timezone problem

There is a known issue for Feeds, where it converts imported dates to GMT, even if they are in a UNIX timestamp format.
There are a number of workarounds right now, including the Feeds Tamper module, or setting your Content Type's date field to "UTC". 
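
To see what the shift looks like, here is a small Python sketch of the arithmetic only (it does not reproduce what Feeds does internally). It takes a date in the feed's format and expresses the same instant in UTC; `to_utc` is a hypothetical helper.

```python
from datetime import datetime, timezone

# Python equivalent of the PHP date format "Y-m-d H:i:s O".
DATE_FORMAT = "%Y-%m-%d %H:%M:%S %z"

def to_utc(date_text):
    """Parse a feed date and return the same instant expressed in UTC."""
    local = datetime.strptime(date_text, DATE_FORMAT)
    return local.astimezone(timezone.utc)

original = "2011-04-28 13:04:39 -0400"
utc = to_utc(original)

print(original)
print(utc.strftime(DATE_FORMAT))  # 2011-04-28 17:04:39 +0000
```

Both lines describe the same moment. The problem appears when the UTC clock time is displayed as if it were local time, which moves the post date by the size of the offset (four hours in this example).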

Dawn

Great resource! Attached files and images?

Hi there - thank you for a very clear tutorial!
I'm wondering how you would handle adding multiple attached files? 
I'm currently using Data Export module in D6 and it only outputs attached files as a comma separated list, which would be fine if I could figure out how to get Feeds XPath Parser in D7 to import these. I outlined the problem here http://drupal.org/node/1874380.
Is there a standard way to attach and migrate files using feeds?
thank you

Adelle Frank

Try Feeds Tamper module

Hi, Dawn:

I don't think there is a standard way to migrate attached files. I believe the following should work, but I haven't had time to test it. If you do try it, would you let me know if it works?

  1. In your Drupal 6 View, add a field from the Uploads category, called "Upload: Attached files"
  2. When configuring that field, choose "Rewrite the output of this field" and put the [upload_fid-url] token into the textarea, so that the absolute link to this file is output in the XML feed. Be sure to choose a comma WITHOUT a space as the separator for multiple values.
  3. In your Drupal 7 site, make sure your content type for Page allows more than one value for the attached files field & that your field allows the extensions (txt, png, etc) you will be migrating.
  4. In your Feed importer, add a mapping for the Attached File field.
  5. On the settings page for "XPath XML parser", enter Attached-files as the query for that field.
  6. Install the Feeds Tamper module, and configure it to explode multiple values.

Adelle Frank

Text Find & Replace Solution (for small amounts of data)

Because Views Export as XML does NOT nest multivalued field values properly in parent/child tags, my co-worker, Matt, came up with this Solution:

Use a text editor to replace <elements> in the exported XML. You can give those fields and their multiple values a parent/child structure manually.
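
For larger exports, the same restructuring could be scripted instead of done by hand. This Python sketch is only one possible approach, not Matt's method. It assumes the multivalued field is called Attached-files and is comma-separated, and it uses a made-up child element name, `file`. `nest_values` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<nodes>
    <node>
        <Nid>71</Nid>
        <Attached-files>a.txt,b.png</Attached-files>
    </node>
</nodes>"""

def nest_values(xml_text, field="Attached-files", child="file", separator=","):
    """Turn each comma-separated field into a parent with one child per value."""
    root = ET.fromstring(xml_text)
    # Copy the matches into a list first, because the tree is modified below.
    for element in list(root.iter(field)):
        text = element.text or ""
        values = [v.strip() for v in text.split(separator) if v.strip()]
        element.text = None
        for value in values:
            ET.SubElement(element, child).text = value
    return ET.tostring(root, encoding="unicode")

print(nest_values(SAMPLE))
```

With this structure, an XPath query such as Attached-files/file would return one match per value.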

Sean

More Resources

Hey,
I found this resource when looking into the same problem.  The only difference is the example they run through is done with a CSV file.  The process is identical.
 
http://prezi.com/g6chbltwdcbx/drupal-data-migration-made-simple-with-feeds/

Adelle Frank

Great resource!

Thanks, Sean :)