Harvesting

Harvest Sources - External systems that Meru harvests content from, for example the URL of an OAI-PMH feed. The harvest source record contains any protocol details or authentication settings.
Harvest Sets - Collections of content within a harvest source. In OAI-PMH, these are the sets exposed by the feed. A set could represent a journal, a department, or any other grouping of content.
Harvest Mappings - Define the relationship between content in an external source and its location in Meru. They include templating logic that describes how a piece of harvested content or data is transformed into what Meru stores.
Harvest Attempts - Individual events in which Meru tries to retrieve and process content from a configured harvest source, based on the mappings and settings you’ve provided.
Harvest Records - A single flat collection of metadata that can be transformed into one or more entities in Meru. An OAI-PMH feed typically doesn’t expose the full hierarchical structure of journal content (journal → volume → issue → article). Meru receives only the article and has to infer the rest of the hierarchy, so a single harvest record can lead to multiple entities being created in Meru.
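
To make the last definition concrete, here is a minimal Python sketch of how one flat record could fan out into a chain of entities. It is an illustration only, not Meru's implementation: the `Entity` class, the `infer_hierarchy` function, and the record keys (`journal_title`, `volume`, `issue`, `article_title`) are all hypothetical names chosen for this example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Hypothetical stand-in for an entity in the target system."""
    kind: str                      # "journal", "volume", "issue", or "article"
    title: str
    parent: Entity | None = None   # None for the top of the hierarchy


def infer_hierarchy(record: dict) -> list[Entity]:
    """Build the journal -> volume -> issue -> article chain from one flat record."""
    journal = Entity("journal", record["journal_title"])
    volume = Entity("volume", f"Volume {record['volume']}", journal)
    issue = Entity("issue", f"Issue {record['issue']}", volume)
    article = Entity("article", record["article_title"], issue)
    return [journal, volume, issue, article]


# One flat harvest record (assumed shape) produces four linked entities.
record = {
    "journal_title": "Example Journal",
    "volume": "12",
    "issue": "3",
    "article_title": "An Example Article",
}
entities = infer_hierarchy(record)
print(" -> ".join(f"{e.kind}: {e.title}" for e in entities))
```

The feed only delivers the article-level record; the journal, volume, and issue entities are derived from fields carried on that record.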

Access Harvesting Dashboard

Log in to the Meru Admin.
To get to the harvesting dashboard, go to Manage in the main navigation and click on Harvesting.

Note: The harvesting UI is only available to users with global_admin privileges.
The main page will show a list of your existing Harvest Sources. You’ll see each source’s name, number of harvest sets, harvest records, metadata formats, and the source URL.

Create a New Harvest Source

Click on the Harvest source + button at the top right under the main navigation.
Create a Name for your harvest source.
Add an Identifier (all lowercase). This cannot be changed after the source is created and is used to help reference the source.
Add the source URL to the URL field.
Select your Harvest protocol and Metadata formats. Note: These cannot be edited once the harvest source has been saved.
Configure your Extraction mapping template.

Choose one of the default presets available in the Examples dropdown.

The extraction mapping template defines how to transform source metadata—such as JATS—into a structured entity hierarchy suitable for harvesting. It acts as a bridge between the raw data and the system’s expected format.

The code editor uses a combination of XML and Liquid. In most cases, the default mapping templates will work without modification. You may only need to customize the template if the source data contains legacy structures or unexpected formatting.
Reasons the default template may need to be updated:
- Adding custom fields (e.g., cover images, extra metadata)
- Addressing incomplete or non-standard source metadata
- Adjusting how Meru interprets or organizes incoming data (e.g., refining the hierarchy or entity relationships)
Mapping options should be left alone in most cases, unless CIC suggests otherwise for a specific source.

The Use metadata mappings option allows you to connect specific data patterns to different points in the hierarchy. By default, harvesting sends all data to a single target entity, but source data isn’t always organized in an ideal way. Mappings let you define how certain types of data should be routed, assigning them to the appropriate parent in the tree structure.

Options that are selected will take precedence over what is set elsewhere.
To change the maximum number of records that can be included per harvest from the default of 80,000, set a value in the Max records field.
Click Save.

Note: When a harvest source is created, Meru queries the endpoint to discover its sets, and it repeats this query at regular intervals to keep the list of sets up to date. The discovered sets are a useful reference when adding sets to mappings and attempts.
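
For an OAI-PMH source, set discovery corresponds to the protocol's `ListSets` verb. The sketch below shows what such a request URL and response look like, as an illustration of the protocol rather than of Meru's internals. The endpoint `https://example.org/oai`, the sample sets, and the function names are assumptions for this example; no network request is made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hand-written sample of a ListSets response (not fetched from a real feed).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListSets>
    <set>
      <setSpec>journal:example</setSpec>
      <setName>Example Journal</setName>
    </set>
    <set>
      <setSpec>dept:history</setSpec>
      <setName>History Department</setName>
    </set>
  </ListSets>
</OAI-PMH>"""


def list_sets_url(base_url: str) -> str:
    """Build the ListSets request URL for an OAI-PMH endpoint."""
    return f"{base_url}?{urlencode({'verb': 'ListSets'})}"


def parse_sets(xml_text: str) -> dict:
    """Map each setSpec (the set's identifier) to its human-readable setName."""
    root = ET.fromstring(xml_text)
    return {
        s.findtext("oai:setSpec", namespaces=OAI_NS):
            s.findtext("oai:setName", namespaces=OAI_NS)
        for s in root.iterfind(".//oai:set", OAI_NS)
    }


print(list_sets_url("https://example.org/oai"))
print(parse_sets(SAMPLE_RESPONSE))
```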

Create a Harvest Mapping

There may be cases where you want more control over harvesting, such as targeting a specific set and directing it to a different entity within Meru, running a harvest on a custom schedule, or using a single source to populate multiple entities. This is especially useful when you don’t want to harvest the entire source or need greater flexibility in how and when content is imported. In these situations, creating a dedicated harvest mapping is the recommended approach.

Note: Harvesting is limited to the set level; you can’t target an individual item within a set.

Create a New Target Entity

To start a new harvest mapping within a harvest source, you need to manually add an empty parent entity (e.g. a journal or series) and, if necessary, a new community.

1. Go to your Collections tab in the main navigation.
2. Click on the Add collection + button under the main navigation.
3. Add all of the desired details to your new collection (Community, Title, Visibility, the correct schema, etc.).
4. Once it has been created and saved, go back to Manage → Harvesting → [Current harvest source].

Note: Once an entity is created, the harvest process will recognize it and overwrite the existing record during future harvests. However, if the entity is deleted, it will be recreated in the next harvest.

Configure Your Harvest Mapping

Now that you have a target entity, you are ready to create your Harvest Mapping.

1. Click on Harvest mapping +.
2. In the Target entity field, add your newly created entity.
3. If you want to target a specific set, add it to the Harvest set field. (The Sets tab on the source lists all available sets within this source.)
4. Select the correct Metadata format and use the Mapping template to customize your metadata if needed.
5. Mapping options that are selected will take precedence over what is set elsewhere (e.g. at the harvest source level).
6. To change the maximum number of records that can be included per harvest from the default of 80,000, set a value in the Max records field.
7. Customize the harvest Schedule. This field accepts a cron expression or a human-readable expression. Leave it blank if attempts should only be triggered manually.
8. Click Save.
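
The Schedule field in step 7 accepts a cron expression. As a reading aid, the sketch below splits an expression into named fields, assuming the standard five-field cron layout (minute, hour, day of month, month, day of week). Which cron dialect and which human-readable phrases Meru accepts is not stated here, so treat the layout as an assumption; `describe_cron` is a hypothetical helper, not part of Meru.

```python
# Field order of a standard five-field cron expression.
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression: str) -> dict:
    """Label each field of a five-field cron expression; reject other shapes."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(
            f"expected {len(CRON_FIELDS)} fields, got {len(parts)}: {expression!r}"
        )
    return dict(zip(CRON_FIELDS, parts))


# "0 2 * * 0" means minute 0, hour 2, any day of month, any month, Sunday:
# in other words, once a week at 02:00 on Sundays.
weekly = describe_cron("0 2 * * 0")
print(weekly)
```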

Create a Harvest Attempt

Each attempt may be triggered manually (by an admin) or automatically (on a schedule, if one is defined). During an attempt, Meru queries the source endpoint, fetches available records, applies any extraction and metadata mappings, and creates or updates content in Meru accordingly.
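
The flow described above (fetch records, apply mappings, create or update content) can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Meru's code: `run_attempt`, the in-memory `store`, and the use of the record's `identifier` as the lookup key are all hypothetical, and the 80,000 default mirrors the Max records default mentioned in this guide.

```python
from itertools import islice
from typing import Callable, Iterable

DEFAULT_MAX_RECORDS = 80_000  # default cap on records per harvest


def run_attempt(
    records: Iterable[dict],
    transform: Callable[[dict], dict],
    store: dict,
    max_records: int = DEFAULT_MAX_RECORDS,
) -> dict:
    """Process up to `max_records` records, creating or updating entries in `store`."""
    counts = {"created": 0, "updated": 0}
    for record in islice(records, max_records):
        entity = transform(record)          # stand-in for the mapping template
        key = record["identifier"]          # identifier supplied by the source
        counts["updated" if key in store else "created"] += 1
        store[key] = entity
    return counts


# One entity already exists; three records arrive; the cap is lowered to two.
store = {"oai:example:1": {"title": "Old title"}}
incoming = [
    {"identifier": "oai:example:1", "title": "New title"},
    {"identifier": "oai:example:2", "title": "Second article"},
    {"identifier": "oai:example:3", "title": "Third article"},
]
counts = run_attempt(incoming, lambda r: {"title": r["title"]}, store, max_records=2)
print(counts)
```

With the cap at two, the third record is left for a later attempt.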

All harvest attempts are logged and viewable under the Attempts tab.

To create an attempt, start within a harvest source and click either the Harvest attempt + or Harvest mapping + button at the top right under the navigation.

Harvest Attempt

1. The harvest source and metadata formats cannot be changed.
2. If you’d like to harvest a specific set, select it from the Harvest set dropdown.
3. In the Target entity field, add your newly created journal.
4. You can customize your Extraction mapping template if desired.
5. If you want to add any additional notes for this attempt, there is a field at the bottom of the page.
6. Click Save.

After saving the attempt, associated records will begin appearing immediately in the Records tab. You’ll also see a Messages tab, which captures logs, status updates, and any errors or warnings generated during the harvest process. This provides real-time visibility into how your harvest is progressing and helps identify any issues that need attention.

Harvest Mapping

The harvest is triggered automatically based on the mapping’s schedule. In this case, the harvest source, metadata formats, set, and target entity are all preset. (See Create a Harvest Mapping)

Navigate the Harvest Records Tab

After entities have been successfully harvested and created, their status in the Records tab will change from Pending to Active.

Click on any Identifier to view the full details of that individual record.

Note: Identifiers are external to Meru. They are provided by the source and not generated within the system. You can use them to cross-reference records in your own systems.

Details

This section displays the full raw metadata as received from the source.

Note: If an entity is missing expected properties, check the latest harvest attempt’s metadata here to verify whether the data was present in the original source.

Entities

This section shows which entities were created in Meru based on the incoming metadata. You can drill down into each level of the hierarchy. Selecting an entity will take you to its Manage section, where you can work with that specific item. Admins will also see a list of harvest records associated with the entity within that section.

Navigate the Messages Tab

You’ll find a Messages tab available at multiple levels—including the overall harvest source and within each individual harvest run—each displaying logs and feedback relevant to that specific context.

While Meru allows you to make manual changes, we recommend making corrections and updates in the source system whenever possible. This ensures consistency and avoids discrepancies between harvests. Use the overwrite protection in Meru only when necessary—for example, to preserve manual edits that can’t be reflected in the source.

Use the Debug toggle in the top-right corner to control the level of detail displayed.

Debug on

When Debug is turned on, you’ll see a full stream of processing logs, including detailed updates and any error messages.

Debug off

When Debug is off, the view is filtered to show only key issues—such as errors, fatal errors, and warnings—making it easier to focus on problems that require attention.
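
The effect of the Debug toggle can be modeled as a simple severity filter. The sketch below is illustrative only: the level names (`debug`, `info`, `warning`, `error`, `fatal`), the `Message` class, and `visible_messages` are assumptions for this example and may not match Meru's internal naming.

```python
from dataclasses import dataclass

# Levels still shown when Debug is off (assumed names).
KEY_LEVELS = {"warning", "error", "fatal"}


@dataclass
class Message:
    level: str
    text: str


def visible_messages(messages: list, debug: bool) -> list:
    """Return every message when debug is on, otherwise only key issues."""
    if debug:
        return list(messages)
    return [m for m in messages if m.level in KEY_LEVELS]


log = [
    Message("debug", "Fetched page 1 of records"),
    Message("info", "Created article entity"),
    Message("warning", "Record is missing an issue number"),
    Message("error", "Could not parse record metadata"),
]
print(len(visible_messages(log, debug=True)))
print([m.level for m in visible_messages(log, debug=False)])
```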

Make Changes to Harvested Entities in Meru

Once an entity is active, navigate to it and click Manage → Details.
Make any desired changes to the entity.
Scroll down to the Harvesting section.

By default, any changes made through the admin interface will be preserved during future harvests. They will not be overwritten.

If you want future harvests to overwrite your manual changes, check the box labeled Allow overwriting?

This checkbox is not persistent. It must be checked each time you want to allow harvesting to make changes. This ensures that manual edits are protected by default and that you explicitly opt in to having them replaced.

Note: The overwrite setting applies only to the specific entity. If a higher-level entity (like a volume or issue) is locked, harvesting can still proceed and update lower-level entities (like articles) within that hierarchy.
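
The overwrite rules described above can be summarized in a small Python model. It is a sketch of the documented behavior under assumptions, not Meru's implementation: `HarvestedEntity`, `apply_harvest`, and the `manually_edited` flag are hypothetical, and how Meru actually tracks manual edits is not specified in this guide.

```python
from dataclasses import dataclass


@dataclass
class HarvestedEntity:
    title: str
    manually_edited: bool = False   # set when an admin changes the entity
    allow_overwrite: bool = False   # models the "Allow overwriting?" checkbox


def apply_harvest(entity: HarvestedEntity, harvested_title: str) -> bool:
    """Apply harvested data to one entity; return True if it was updated."""
    if entity.manually_edited and not entity.allow_overwrite:
        return False                    # manual edits are protected by default
    entity.title = harvested_title
    entity.allow_overwrite = False      # the opt-in is one-shot, not persistent
    return True


# A locked issue does not block updates to an article beneath it.
issue = HarvestedEntity("Issue 3 (edited by hand)", manually_edited=True)
article = HarvestedEntity("Old article title")
issue_updated = apply_harvest(issue, "Issue 3")
article_updated = apply_harvest(article, "New article title")
print(issue_updated, article_updated)

# Checking the box allows exactly one overwrite; the next harvest is blocked again.
issue.allow_overwrite = True
first = apply_harvest(issue, "Issue 3")
second = apply_harvest(issue, "Issue 3, revised")
print(first, second)
```

Each entity is evaluated on its own, which is why the article updates even while its parent issue stays protected.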