Intermediate Format
Preparing and generating the intermediate format
The intermediate format has to be generated and structured as files for the migrator application to ingest the data.
- Each row should hold exactly one record, and that record should be valid JSON
- Each file should contain 1000 rows
- The file should be a text file (.txt), gzipped, and named using the File Naming convention
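The rules above can be sketched in Python as follows. This is a minimal illustration, not the migrator's own tooling: the function name `write_intermediate_files` and its parameters are hypothetical, and writing any leftover records to a shorter final file is an assumption, since the handling of a remainder is not specified here.

```python
import gzip
import json
from pathlib import Path

ROWS_PER_FILE = 1000  # row count per file, as stated in the rules above


def write_intermediate_files(records, file_type, publisher, out_dir="."):
    """Write records to gzipped text files, one JSON object per row.

    Hypothetical helper: records is a list of JSON-serializable dicts.
    Returns the list of paths written, in sequence order.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for start in range(0, len(records), ROWS_PER_FILE):
        chunk = records[start:start + ROWS_PER_FILE]
        sequence = start // ROWS_PER_FILE + 1  # file numbering starts at 00001
        path = out_dir / f"{file_type}-{publisher}-{sequence:05d}.txt.gz"
        # "wt" opens the gzip stream in text mode so rows are written as strings
        with gzip.open(path, "wt", encoding="utf-8") as handle:
            for record in chunk:
                # json.dumps without indent keeps each record on a single row
                handle.write(json.dumps(record) + "\n")
        paths.append(path)
    return paths
```

Calling `write_intermediate_files(stories, "story", "fancynews", "out")` with 2500 records would, under these assumptions, produce three files in `out`, the last one holding the remaining 500 rows.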
File Naming
<TYPE>-<PUBLISHER-NAME>-<5 DIGITS>.txt.gz
Name | Description |
---|---|
TYPE | Type of data that the file holds, in lowercase. Values should be story, section, author, user, etc. |
PUBLISHER-NAME | Publisher's name, in lowercase |
5 DIGITS | 5-digit sequence number, starting from 00001, for each file type |
Some examples of valid file names:
story-fancynews-00001.txt.gz
user-fancynews-00001.txt.gz
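The naming convention can also be built and checked programmatically. The sketch below is illustrative only: `build_file_name` and `is_valid_file_name` are hypothetical helpers, and the pattern assumes that the type consists of lowercase letters and that the publisher name consists of lowercase letters and digits, which this document does not state explicitly.

```python
import re

# Assumed pattern: <type>-<publisher>-<5 digits>.txt.gz, all lowercase
FILE_NAME_PATTERN = re.compile(
    r"(?P<type>[a-z]+)-(?P<publisher>[a-z0-9]+)-(?P<sequence>\d{5})\.txt\.gz"
)


def build_file_name(file_type, publisher, sequence):
    """Build a file name such as story-fancynews-00001.txt.gz."""
    if not 1 <= sequence <= 99999:
        raise ValueError("sequence must fit in 5 digits and start from 1")
    # Type and publisher are lowercased; the sequence is zero-padded to 5 digits
    return f"{file_type.lower()}-{publisher.lower()}-{sequence:05d}.txt.gz"


def is_valid_file_name(name):
    """Return True if the whole name matches the assumed pattern."""
    return FILE_NAME_PATTERN.fullmatch(name) is not None


print(build_file_name("story", "FancyNews", 1))           # story-fancynews-00001.txt.gz
print(is_valid_file_name("user-fancynews-00001.txt.gz"))  # True
print(is_valid_file_name("Story-fancynews-1.txt.gz"))     # False
```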
Sample Files
To help you get started with creating intermediate files, we have attached some samples to serve as a ready reference.
The sample files folder contains the following files:
Name | Description |
---|---|
story-fancynews-00001.txt.gz | Contains samples of basic story |
sections-fancynews-00001.txt.gz | Contains samples of 3 sections, including a parent section, child section and grandchild section |
authors-fancynews-00001.txt.gz | Contains samples of 3 authors |
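To inspect a sample file, or to check a file you have generated, each row can be read back and parsed as JSON. The following is a minimal sketch; `read_rows` is a hypothetical helper and is not part of the migrator application.

```python
import gzip
import json


def read_rows(path):
    """Return the parsed records of a gzipped intermediate file, one per row.

    Raises ValueError naming the first row that is not valid JSON.
    """
    records = []
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as error:
                raise ValueError(
                    f"{path}: row {line_number} is not valid JSON"
                ) from error
    return records


# Example usage, assuming the sample file is in the current directory:
# rows = read_rows("story-fancynews-00001.txt.gz")
# print(len(rows), "rows read")
```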