Intermediate Format

Preparing and generating the intermediate format

The intermediate data has to be generated and organized into files so that the migrator application can ingest it.

  • Each row should contain exactly one record, and it should be valid JSON
  • Each file is required to contain 1000 rows
  • The file should be a text file (.txt), should be gzipped, and should be named according to the File Naming convention
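
The requirements above can be sketched in Python roughly as follows. This is a minimal illustration, not the migrator's own tooling: the function names are hypothetical, UTF-8 encoding is an assumption, and so is the choice to write a final batch that holds fewer than 1000 rows.

```python
import gzip
import json

ROWS_PER_FILE = 1000  # row count stated in the requirements above


def split_into_batches(records, size=ROWS_PER_FILE):
    """Yield consecutive lists of at most `size` records."""
    batch = []
    for record in records:
        batch.append(record)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Assumption: a final, shorter batch is still emitted.
        yield batch


def write_intermediate_file(records, path):
    """Write one JSON record per row into a gzipped text file."""
    # "wt" opens the gzip stream in text mode; UTF-8 is an assumption.
    with gzip.open(path, "wt", encoding="utf-8") as fh:
        for record in records:
            # Without `indent`, json.dumps emits no line breaks,
            # so each record occupies exactly one row.
            fh.write(json.dumps(record) + "\n")
```

Each batch produced by `split_into_batches` would then be passed to `write_intermediate_file` together with a path that follows the File Naming convention.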

File Naming


Name            Description
TYPE            Type of data that the file holds, in lowercase. Values should be story, section, author, user, etc.
PUBLISHER-NAME  Publisher's name, in lowercase
5 DIGITS        5-digit sequence number, starting from 00001, for each file type
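
As a rough sketch, a file name could be assembled from these three parts as shown below. The hyphen separator and the .txt.gz suffix are inferred from the sample file name story-fancynews-00001.txt.gz, and the helper name `intermediate_file_name` is hypothetical.

```python
def intermediate_file_name(data_type, publisher, sequence):
    """Build a file name from TYPE, PUBLISHER-NAME and the 5-digit sequence."""
    if not 1 <= sequence <= 99999:
        raise ValueError("sequence must start from 1 and fit in 5 digits")
    # Assumption: parts are joined with hyphens and suffixed with .txt.gz,
    # as in the sample name story-fancynews-00001.txt.gz.
    return f"{data_type.lower()}-{publisher.lower()}-{sequence:05d}.txt.gz"
```

Calling `intermediate_file_name("story", "fancynews", 1)` gives `story-fancynews-00001.txt.gz`. The sequence is tracked separately for each file type, so the first section file would also end in 00001.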

Some examples of good file names:

  • story-fancynews-00001.txt.gz
  • section-fancynews-00001.txt.gz
  • author-fancynews-00001.txt.gz

Sample Files

To help you get started with creating intermediate files, we have attached some samples to serve as a ready reference.

Intermediate Sample Files

The sample files folder contains the following files:

Name                            Description
story-fancynews-00001.txt.gz    Contains samples of a basic story
section-fancynews-00001.txt.gz  Contains samples of 3 sections, including a parent section, a child section and a grandchild section
author-fancynews-00001.txt.gz   Contains samples of 3 authors
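
To look inside a sample file, or to check a file you have generated, a small reader like the one below may help. It is only a sketch: `read_records` is a hypothetical helper and UTF-8 encoding is an assumption.

```python
import gzip
import json


def read_records(path):
    """Yield (row_number, record) for each row of a gzipped intermediate file."""
    # Assumption: the text inside the gzip archive is UTF-8 encoded.
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for row_number, line in enumerate(fh, start=1):
            try:
                yield row_number, json.loads(line)
            except json.JSONDecodeError as exc:
                # Report which row breaks the one-valid-JSON-record-per-row rule.
                raise ValueError(f"{path}: row {row_number} is not valid JSON") from exc
```

For example, `for row_number, record in read_records("author-fancynews-00001.txt.gz"): print(row_number, record)` prints each author record along with its row number.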

