
Intermediate Format

Preparing and generating the intermediate format

The intermediate format must be generated as a set of files, structured as follows, so that the migrator application can ingest the data.

  • Each row should contain exactly one record of data, and that record should be valid JSON
  • Each file should contain 1000 rows
  • Each file should be a gzipped text file (.txt.gz), named according to the File Naming convention
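The rules above can be sketched in Python as follows. This is a minimal illustration rather than the migrator's own tooling: the helper names (chunked, write_rows) and the record fields used in the usage note are hypothetical, and how a final batch with fewer than 1000 rows should be handled is not covered here.

```python
import gzip
import json
from itertools import islice

ROWS_PER_FILE = 1000  # row count per file, as stated in the rules above


def chunked(records, size=ROWS_PER_FILE):
    """Yield lists of at most `size` records from any iterable."""
    it = iter(records)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def write_rows(path, records):
    """Write one JSON record per row into a gzipped text file."""
    with gzip.open(path, "wt", encoding="utf-8") as fh:
        for record in records:
            # json.dumps escapes newlines inside strings, so each record
            # always occupies exactly one row of the file.
            fh.write(json.dumps(record) + "\n")
```

For example, write_rows("story-fancynews-00001.txt.gz", batch) would write one batch produced by chunked(...), where each record is a dict such as {"id": 1, "headline": "..."} (illustrative fields only, not the real story schema).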

File Naming

<TYPE>-<PUBLISHER-NAME>-<5 DIGITS>.txt.gz

Name            Description
TYPE            Type of data that the file holds, in lowercase. Values include story, section, author, user, etc.
PUBLISHER-NAME  Publisher's name, in lowercase
5 DIGITS        5-digit sequence number, starting from 00001, for each file type

Some examples of valid file names:

story-fancynews-00001.txt.gz

user-fancynews-00001.txt.gz
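As a rough sketch, file names in this pattern could be built and checked in Python as shown below. The function names are hypothetical, and the regular expression assumes that the type contains only lowercase letters and the publisher name only lowercase letters and digits; the naming rules above do not spell out the allowed characters, so adjust the pattern if your publisher name differs.

```python
import re

# <TYPE>-<PUBLISHER-NAME>-<5 DIGITS>.txt.gz
# Assumption: TYPE is lowercase letters, PUBLISHER-NAME is lowercase letters/digits.
NAME_PATTERN = re.compile(r"[a-z]+-[a-z0-9]+-\d{5}\.txt\.gz")


def intermediate_file_name(data_type, publisher, sequence):
    """Build a file name such as story-fancynews-00001.txt.gz."""
    if not 1 <= sequence <= 99999:
        raise ValueError("sequence must fit in 5 digits and start from 1")
    # Lowercase both parts and zero-pad the sequence number to 5 digits.
    return f"{data_type.lower()}-{publisher.lower()}-{sequence:05d}.txt.gz"


def is_valid_name(name):
    """Return True if `name` matches the assumed naming pattern."""
    return NAME_PATTERN.fullmatch(name) is not None
```

Calling intermediate_file_name("story", "FancyNews", 1) yields story-fancynews-00001.txt.gz, matching the first example above.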

Sample Files

To help you get started with creating intermediate files, we have attached some samples that can serve as a ready reference.

Intermediate Sample Files

The sample files folder contains the following files:

Name                             Description
story-fancynews-00001.txt.gz     Contains samples of a basic story
sections-fancynews-00001.txt.gz  Contains samples of 3 sections, including a parent section, a child section and a grandchild section
authors-fancynews-00001.txt.gz   Contains samples of 3 authors
