## I- Model Stage I
Total size required for stage1 (without deleting intermediate data) is about 3312GB.
### A- Data extraction for intra from vanilla VTM
#### 1. Dataset preparation - div2k conversion
...
## II- Model Stage 2
Total size required for stage2 (without deleting intermediate data) is about 5TB.
### A- Data extraction
#### 1. Dataset preparation - bvi/tvd conversion
...
It will generate the cfg files for the dataset and a shell script to encode and decode the dataset.
Loop on all sequences to encode, for example:
```sh
cd stage2/encdec;
for((i=0;i<90;i++));do
  ./encode_decode_dataset_tvd.sh $i;
done
for((i=0;i<10;i++));do
  ./encode_decode_dataset_tvd_valid.sh $i;
done
for((i=0;i<3025;i++));do
  ./encode_decode_dataset_bvi.sh $i;
done
for((i=0;i<75;i++));do
  ./encode_decode_dataset_bvi_valid.sh $i;
done
```
Alternatively, you can run these scripts on your cluster, as sketched below. N is the number of sequences for each script (run ./encode_decode_dataset.sh to get the value N).
**Note**: The size requirement is about 3.3TB for the dumped data.
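For example, on a Slurm-managed cluster the four scripts could be submitted as array jobs, one task per sequence index. This is only a sketch: the scheduler and the `sbatch` options are assumptions, only the script names and index ranges come from the loops above.
```sh
# Hypothetical Slurm submission: one array task per sequence index.
# Adapt the scheduler/options to your cluster; the ranges match the loop bounds above.
cd stage2/encdec
sbatch --array=0-89   --wrap './encode_decode_dataset_tvd.sh $SLURM_ARRAY_TASK_ID'
sbatch --array=0-9    --wrap './encode_decode_dataset_tvd_valid.sh $SLURM_ARRAY_TASK_ID'
sbatch --array=0-3024 --wrap './encode_decode_dataset_bvi.sh $SLURM_ARRAY_TASK_ID'
sbatch --array=0-74   --wrap './encode_decode_dataset_bvi_valid.sh $SLURM_ARRAY_TASK_ID'
```
Very large arrays may exceed your cluster's MaxArraySize limit; in that case split the BVI range into several smaller submissions.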
#### 4. Create consolidated datasets
...
It will generate a unique dataset for each dataset in ["stage2"]["encdec"]["path"] from all individual datasets in ["stage2"]["encdec_xxx"]["path"]/["dump_dir"] and encoder logs in ["stage2"]["encdec_xxx"]["enc_dir"].
**Note**: The size requirement is about 1.6TB for the datasets.
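The key paths mentioned above can be checked directly in the configuration file. A minimal sketch, assuming the configuration is a JSON file and that `jq` is available; the file name `config.json` is hypothetical, and `encdec_xxx` stands for one per-sub-dataset section (e.g. the tvd and bvi sets used in the previous step):
```sh
# Sketch only: print the key paths referenced above (file name and JSON format are assumptions).
jq '.stage2.encdec.path' config.json            # where the consolidated datasets are written
jq '.stage2.encdec_xxx.path' config.json        # root of one individual dumped dataset
jq '.stage2.encdec_xxx.dump_dir' config.json    # dump directory inside that root
jq '.stage2.encdec_xxx.enc_dir' config.json     # encoder logs for that sub-dataset
```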