Getting started with Hierarchical Archive

Hierarchical Archive allows multi-level archiving to be implemented in Zoom. An asset can independently move between different tiers of the hierarchy as configured for the system. You need to set up a Hub Server for this to work. The Hub manages the jobs that move assets between tiers and provides reports for the admin users.

 

From Zoom 7.3 onwards, the framework for setting up Hierarchical Archive is installed with Zoom. It is ready to use after we enable the license, configure it in Zoom, and also set up a Hub Server to manage Hierarchical Archive jobs. After setting up the Hierarchical Archive, once an archive/restore job is triggered in Zoom, it runs independently of Zoom processes as it is managed by the Hub Server.

 

Once you have decided that you need Hierarchical Archive with your Zoom setup, begin by ensuring that these prerequisites are met:

  1. Ensure you have received an Archive and Hierarchical Archive license from Evolphin Sales.
  2. Designate a server machine inside your network as the Hub. The Hub can also run on the existing Zoom Server or Preview Server machine. The machine needs Java 8 or higher.
  3. Plan on having an exclusive Zoom user account for the Hub. One Zoom user account per Hub will be needed.
  4. Choose a machine on which to install a SQL database server, or use an existing one, for job reporting.
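
The Java prerequisite above can be checked before installing the Hub. The following is a minimal sketch, assuming the java executable is on the PATH of the designated Hub machine; the function names are hypothetical and not part of Zoom or the Hub.

```python
import re
import subprocess


def parse_java_major(version_text: str) -> int:
    """Return the major Java version found in `java -version` style output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_text)
    if match is None:
        raise ValueError("no Java version string found")
    first = int(match.group(1))
    second = match.group(2)
    # Java 8 and earlier report themselves as 1.x (for example 1.8.0_292).
    if first == 1 and second is not None:
        return int(second)
    return first


def installed_java_major() -> int:
    """Run `java -version` and parse its output, which Java prints on stderr."""
    result = subprocess.run(
        ["java", "-version"], capture_output=True, text=True, check=True
    )
    return parse_java_major(result.stderr + result.stdout)


def meets_hub_requirement(major: int) -> bool:
    # The prerequisite list asks for Java 8 or higher.
    return major >= 8
```

Calling meets_hub_requirement(installed_java_major()) on the Hub machine gives a quick yes/no answer before you continue with the setup.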

 

After these prerequisites are met, proceed to set up Hierarchical Archive:

 

While enabling Hierarchical Archive we need to also enable Basic Archive on Zoom, if it is not already done. Follow these steps to enable Basic and Hierarchical Archive:

  1. Open the Web Management Console for your Zoom Server.
    ex. http://localhost:8443 or http://<zoomserver>:8443
  2. Log in using your admin credentials.
  3. In the Admin Menu sidebar, click Server Control Panel under the System node.
  4. Click Archive Management on the Server Control Panel page.

  5. Click Enable Archive to enable it.
  6. Specify a local path on the Zoom Server as the Archive Location.
    If using an external archive module, like SGL or S3, the archived files are moved to the external archive systems from this path. 
    Ex. E:\zoom\archive\ or /mnt/Archive on the Zoom MAM Server.
  7. If you are not setting up External Archive with scripts then pre- and post- scripts for archive and restore are optional. You can still set up scripts here to execute before and after the archive/restore operations as needed.
  8. If you are setting up External Archive with scripts, then you need these script files:
    1. Specify the path for Pre-script for Archive from the location where the script installer placed the script files (as described here). For example, if you had specified /home/evolphin as the root path for the script installer, then the path will be /home/evolphin/zoom-deploy/ArchivePreHook/archive-hook.pl.
    2. Similarly, specify the path for Pre-script for Restore from the location of the restore pre-script in the script files on Zoom MAM Server. For example, as per the above example, this path will be /home/evolphin/zoom-deploy/RestorePreHook/restore-hook.pl.
  9. Update Limit on Arguments on Command-line to be 0. This is needed to remove the limit on the number of asset IDs passed in a Zoom command while using archiving.
  10. Click Show Additional Options for Archive / Restore to enable it. This allows Zoom to show Hierarchical Archive options while choosing archive/restore for any asset in the Asset Browser.
  11. Click Save.
  12. You will be prompted to restart the server. Click Yes.
  13. Refresh your web browser.
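
The script paths described in step 8 follow a fixed layout under the root path given to the script installer. A minimal sketch of how they are derived, assuming a Linux-style Zoom MAM Server; the function and key names are hypothetical and used only for illustration:

```python
import posixpath


def hook_script_paths(installer_root: str) -> dict:
    """Build the pre-script paths from the script installer's root path."""
    deploy_dir = posixpath.join(installer_root, "zoom-deploy")
    return {
        # Value for "Pre-script for Archive" in Archive Management.
        "archive_pre_script": posixpath.join(
            deploy_dir, "ArchivePreHook", "archive-hook.pl"
        ),
        # Value for "Pre-script for Restore" in Archive Management.
        "restore_pre_script": posixpath.join(
            deploy_dir, "RestorePreHook", "restore-hook.pl"
        ),
    }
```

With /home/evolphin as the installer root, this yields the two example paths given in step 8.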

 

Zoom stores Hub configuration to connect with a Hub Server. These configuration values should be added to Zoom before installing the Hub Server.

Log in to the Web Management Console and click the Hub Settings Panel under the System node in the Admin Menu sidebar.

 

On the Hub Settings Panel, click Add Hub and specify a name for the new Hub. Click Add.

You can also select an existing hub from the list and its settings are loaded in the Hub Settings Panel. When a Hub is selected in the Hub box, click Delete to delete that Hub. Select Yes to confirm the deletion.
For Zoom 7.3, when you use multiple Hubs in a multi-location Zoom setup and want a Hub to work for only one location, name the Hub after that location.


Update the basic connectivity details for the hub, like Host and Port settings. For location-specific settings, also specify the location where the Hub should work.

Once the basic connectivity parameters are specified, enter values for the advanced configuration. It is divided into two parts: the Functional Configuration and the Archive/Restore Configuration. The sections below describe the values to enter for each.

If you are configuring the Hub for AI, only the Database Spec and Client Whitelisting Setting under Functional Configuration need to be set here.

 

Functional Configuration

   

 

Process Control

These configuration parameters control how responsive the Hub will be. Fill them in based on your assessment of the load that the Hub is expected to handle. If you are unsure about these, use the default values.

  • Core Pool Size: Minimum number of threads that will be created by default for job execution.
  • Max Pool Size: Maximum number of threads that can be created for job execution.
  • Queue Blocking Limit: Number of jobs that can be queued in the Hub for processing. 
  • Keep Alive Time: Duration (in minutes) for which resources are kept when the Hub is not servicing any requests; after this duration, the resources are surrendered. 
  • Max Retry Count: Number of times the failed jobs are retried automatically. 
  • Enable Hub Analytics: Flag to turn on or off the Analytics module on the Hub. 
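
Before saving custom Process Control values, it can help to sanity-check them against the definitions above. The sketch below is an illustration only: the class, its field names, and the rules it applies (for example, that Core Pool Size should not exceed Max Pool Size) are assumptions drawn from the descriptions in this list, not the Hub's documented validation.

```python
from dataclasses import dataclass


@dataclass
class ProcessControl:
    core_pool_size: int        # minimum number of job-execution threads
    max_pool_size: int         # maximum number of job-execution threads
    queue_blocking_limit: int  # number of jobs that can wait in the queue
    keep_alive_minutes: int    # idle time before resources are released
    max_retry_count: int       # automatic retries for failed jobs

    def problems(self) -> list:
        """Return human-readable issues; an empty list means none were found."""
        issues = []
        if self.max_pool_size < 1:
            issues.append("Max Pool Size should be at least 1")
        if self.core_pool_size < 0:
            issues.append("Core Pool Size should not be negative")
        if self.core_pool_size > self.max_pool_size:
            issues.append("Core Pool Size should not exceed Max Pool Size")
        if self.queue_blocking_limit < 0:
            issues.append("Queue Blocking Limit should not be negative")
        if self.keep_alive_minutes < 0:
            issues.append("Keep Alive Time should not be negative")
        if self.max_retry_count < 0:
            issues.append("Max Retry Count should not be negative")
        return issues
```

The numbers passed to ProcessControl are whatever you plan to enter in the panel; the sketch does not know the Hub's default values.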

 

Database Spec


The Zoom Server uses a SQL database for some of its other modules too. If it is already configured and you would like to use the same database for the Hub, select the Copy from Zoom option. If you want a new or separate database for the Hub, download the OS-specific MySQL installer, run it, and follow the instructions. Then create a user and provide the host, username, password, and driver (com.mysql.cj.jdbc.Driver) details here.

 

Client Whitelisting Setting

Here, you can add the list of IPs from which the Hub will accept requests.

Typically, the Zoom Server sends requests to the Hub. In addition, anyone viewing the Hub Dashboard also sends queries, so the IPs of all clients that need access to the Hub Dashboard must be added here as well.

If this is left blank, then the Hub will service requests from all client machines.
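
The whitelisting rule described above can be sketched as follows. This is an illustration of the documented behavior (a blank list allows every client), not the Hub's actual implementation; the function name is hypothetical, and the sketch assumes the list holds individual IP addresses rather than ranges.

```python
import ipaddress


def is_client_allowed(client_ip: str, whitelist: list) -> bool:
    """Decide whether a request from client_ip would be serviced."""
    # A blank whitelist means requests from all client machines are serviced.
    if not whitelist:
        return True
    client = ipaddress.ip_address(client_ip)
    # Otherwise the client must match one of the listed addresses exactly.
    return any(
        client == ipaddress.ip_address(entry.strip()) for entry in whitelist
    )
```

When filling in the list, remember to include the Zoom Server itself along with every machine that opens the Hub Dashboard.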
 

Archive/Restore Configuration

In this section, the details of all destination tiers are added.

All the paths should be with respect to the Hub Server.
 

FS Tier 1

  1. Default: Comma-separated list of TPM (Third Party Mount point) paths whose assets are expected to be archived using the current Hub.  
  2. Direct Asset Archive DB Mount Path: The default location where the Zoom Server archives the direct-ingest assets to. 
  3. Project-wise TPM Mapping: Project-specific archive destination locations as configured in the Zoom Server archive settings. 

FS Tier 2

  1. Default: Comma-separated list of TPM paths whose assets are expected to be archived using the current Hub.  
  2. Project-wise TPM Mapping: Project-specific archive destination locations for tier-2 archive

S3 Tier 2

Enter the details of the Amazon S3 bucket like the secret key, access key, and region. Here it is also possible to define unique S3 buckets for specific Zoom projects. For projects not assigned a unique bucket, the default one will be used. 
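
The fallback between project-specific buckets and the default bucket can be pictured with a small lookup. This is a sketch under stated assumptions: the function name, project names, and bucket names are hypothetical, and the Hub's real lookup is not shown in this article.

```python
def bucket_for_project(project: str, project_buckets: dict, default_bucket: str) -> str:
    """Return the S3 bucket an asset of the given project is archived to."""
    # Projects without a unique bucket fall back to the default one.
    return project_buckets.get(project, default_bucket)


# Hypothetical example values, for illustration only.
example_buckets = {"Marketing": "archive-marketing"}
example_default = "archive-default"
```

Here bucket_for_project("Marketing", example_buckets, example_default) selects the project bucket, while any unlisted project resolves to the default bucket.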

Click Save Hub after configuring the necessary details across various panels.

 

Any change in the Hub settings made via the Web Management Console requires the Hub to be restarted to take effect. Restart the Evolphin Job Hub service to see the changes.
Be very careful when editing the TPM paths or the Tier details. If you archive assets to a certain S3 bucket or file-system location and the configuration is changed afterwards, those assets can no longer be accessed from the previously stored location. Any attempt to restore assets archived under the lost configuration will then fail with a data loss error. This cannot be corrected without a significant amount of manual intervention and copying of data.

 

The Hub Server is installed to manage jobs for various Zoom modules. It can be used to manage Hierarchical Archive or AI jobs for Zoom.

You can install it on any server in your Zoom network. You need to have the Hub installer from Evolphin Support to proceed.

Install the Hub only after the Hub configuration has been added to Zoom.

 

Installing the Hub

Extract the shared Hub installer zip to any path on your designated Hub Server machine and follow the steps below to install the Hub:

On Windows

  1. Open Command Line as an administrator.
  2. Navigate to the /bin folder inside the extracted Hub build files.
  3. Run Hub install (this will register Evolphin Job Hub as a service)
  4. Run Hub start (this will start the Hub Server)
    To stop use Hub stop; for restart use Hub restart. To remove this service use Hub remove.

On Linux

  1. Navigate to the ../EJH/bin folder.
  2. Run ./Hub start (this will start the Hub Server)
    To stop, use ./Hub stop; to restart, use ./Hub restart; to remove this service, use ./Hub remove.
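
If you script the service commands listed above, a small helper can keep the platform differences in one place. This is a minimal sketch: the function is hypothetical, it only builds the command and does not run it, and it rejects install on Linux simply because the Linux steps above do not list that action.

```python
WINDOWS_ACTIONS = {"install", "start", "stop", "restart", "remove"}
LINUX_ACTIONS = {"start", "stop", "restart", "remove"}


def hub_command(action: str, on_windows: bool) -> list:
    """Return the command to run from the Hub's bin folder for an action."""
    allowed = WINDOWS_ACTIONS if on_windows else LINUX_ACTIONS
    if action not in allowed:
        raise ValueError(f"unsupported Hub action for this platform: {action}")
    # On Linux the script is invoked relative to the current folder.
    executable = "Hub" if on_windows else "./Hub"
    return [executable, action]
```

The returned list can be passed to subprocess.run with cwd set to the bin folder of the extracted Hub build, from an administrator prompt on Windows.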

Registering with Zoom

After installing the Hub using the steps above, open the Hub Configuration page using the URL http://[HubIP]:[HubPort]/config/Hub-config in your browser to configure the remaining details and register the Hub with your Zoom Server. The Hub IP and port are the same as configured in the Hub Settings Panel inside the Web Management Console earlier.
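
As an illustration of the URL pattern above, the sketch below fills in the two placeholders. The function name is hypothetical, and the host and port you pass in are whatever you entered in the Hub Settings Panel.

```python
def hub_config_url(hub_host: str, hub_port: int) -> str:
    """Build the Hub Configuration page URL from the Hub host and port."""
    return f"http://{hub_host}:{hub_port}/config/Hub-config"
```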

Update the basic connectivity details like the Hub Host and Hub Port.

Update the Location with the location of the hub as specified in the Web Management Console’s Hub Settings Panel.

Update the Zoom Server’s Web Management Console’s URL under Webmin URL.

In order to run search queries on Zoom and to perform other operations throughout the course of the archive/restore operations, a dedicated Zoom user account is required. Update that Zoom User and Zoom Password here. 

Enter the Zoom Minimum Support Version, which you can find by running the zm version command on the command line.

Select Enable Location Based Hub if you want this Hub to service archive/restore requests only for assets that are in the same location as the Hub. If this is not selected, the Hub services all assets that need to be processed, provided it has physical access to them, even if they are in a different location.
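
The effect of the Enable Location Based Hub option can be summarized as a simple decision. This is a sketch of the behavior described above, not the Hub's code; the function and parameter names are hypothetical, and whether physical access is also checked in location-based mode is not stated here, so the sketch compares locations only in that mode.

```python
def hub_should_service(
    asset_location: str,
    hub_location: str,
    location_based: bool,
    hub_can_reach_asset: bool,
) -> bool:
    """Decide whether this Hub would pick up a request for an asset."""
    if location_based:
        # Only assets in the Hub's own location are serviced.
        return asset_location == hub_location
    # Otherwise any asset the Hub has physical access to is serviced,
    # even when it belongs to a different location.
    return hub_can_reach_asset
```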

Click Save.

The Hub is now registered with Zoom. If needed, you can modify these settings here on the Hub Config page and update other Hub configuration parameters through the Zoom Web Management Console.

 

After setting up your Hub, you can see how the hierarchical archive works by following this example.

Zoom Hierarchical Archive Flow

  1. In the Asset Browser or Web Asset Browser, select one or more assets to be archived (Tier 0 to Tier 1) and select Archive from the Action menu. You could also right-click on the selected assets to select Archive from the context menu in the Asset Browser. The Archived assets are identified by the Archive icon on their thumbnails. Check that their Archiving metadata shows that these are on Tier 1. The new Hierarchical Archive window is shown.
  2. Select which resolution/proxy asset you want to archive. (one or more from Hi-Res, Mid-Res, or Direct are shown depending on which of these are supported for the selected assets). 
  3. Select Target Storage Type. Currently, FS (File-System) and S3 (Amazon S3) are supported.
  4. Click Archive. Check the Zoom Archive Info metadata group. The archive status for the corresponding resolution gets updated to Pending Archive and the Native Target Location gets updated to reflect the destination. For the example shown below, the Native Target Location is the Tier 2 S3 bucket. 
  5. Later, when this request is picked up by the Hub for processing, the status changes to Ongoing Archive and a job ID is also assigned to the operation. This job ID can be used for searching in the hub dashboard to get the most recent status of the asset/job.

  6. If the request is invalid for some reason, the status changes to Failed Archive.
  7. Correspondingly, the hub dashboard shows the updated job status. Initially, it shows a job’s status as Pending. At this point, the hub is waiting to scrutinize the assets in the job to estimate the time of completion.
  8. After that, the status changes to Not Started and then proceeds to File Copy Started / External Copy Started, Completed, Failed, etc., depending on the outcome at each stage.

  9. After completion of a request, the status gets updated to Completed Archive and the Native Target Location is set to empty once again.

The asset’s chosen proxy is now archived. A similar flow is followed for restoring assets.
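
The archive statuses shown in the Zoom Archive Info metadata during this flow can be read as a small state machine. The sketch below models only the asset-side statuses named in the steps above. The exact set of allowed transitions is an assumption (for example, that a request can fail either before or after the Hub picks it up), and the names in the code are hypothetical.

```python
# Asset-side archive statuses and the statuses that may follow each one.
ASSET_TRANSITIONS = {
    "Pending Archive": {"Ongoing Archive", "Failed Archive"},
    "Ongoing Archive": {"Completed Archive", "Failed Archive"},
    "Completed Archive": set(),
    "Failed Archive": set(),
}


def can_transition(current: str, new: str) -> bool:
    """Return True if the flow above allows moving from current to new."""
    return new in ASSET_TRANSITIONS.get(current, set())


def is_finished(status: str) -> bool:
    """Return True for statuses with no follow-up status in this flow."""
    return status in ASSET_TRANSITIONS and not ASSET_TRANSITIONS[status]
```

A monitoring script could use is_finished to decide when to stop polling an asset's archive status.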