Harvest for FAO Fishery Statistics Created 21 Jul 08:34

Stage: completed
Fetched: 21 Jul 08:34
Validated: 21 Jul 08:34
Deltas Created: 21 Jul 08:34
Units Normalized: 21 Jul 09:11
Ancestry Built: 21 Jul 08:45
Nodes Matched: 21 Jul 09:02
Names Parsed: 21 Jul 08:46
New Models Stored: 21 Jul 08:45
Indexed: 21 Jul 09:02
Completed: 21 Jul 09:17
Time to Harvest: 1 minute

Harvesting Log

(1304 lines) (showing only the last 1000 lines, see /app/public/data/fao_fishery_stat/process.log for the full file)
[INFO] [2020-06-23 16:55:07] 356 Unmatched nodes (of 13939)! That's too many to output. First 10: Oreochromis andersonii × Oreochromis niloticus (#80122491); Oreochromis aureus × Oreochromis niloticus (#80130975); E. fuscoguttatus × E. lanceolatus (#80125303); Centracanthidae (#80123480); Branchiostegidae (#80123648); Branchiostegidae (#80133616); Centrogeniidae (#80125195); Inermiidae (#80127199); Coiidae (#80127530); Notograptidae (#80129884)
[START] [2020-06-23 16:55:07] update_nodes
[STOP] [2020-06-23 16:55:13] update_nodes
[STOP] [2020-06-23 16:55:13] match_nodes
[START] [2020-06-23 16:55:13] reindex_search
[STOP] [2020-06-23 16:55:31] reindex_search
[START] [2020-06-23 16:55:31] normalize_units
[STOP] [2020-06-23 17:04:11] normalize_units
[START] [2020-06-23 17:04:11] calculate_statistics
[STOP] [2020-06-23 17:04:12] calculate_statistics
[START] [2020-06-23 17:04:12] complete_harvest_instance
[START] [2020-06-23 17:04:12] overall_tsv_creation
[INFO] [2020-06-23 17:04:12] Processing group of 13939 in 2 batches of 10000
[INFO] [2020-06-23 17:05:32] 12078 Traits (unfiltered)...
[INFO] [2020-06-23 17:05:46] 12078 Traits (filtered)...
[INFO] [2020-06-23 17:05:46] 0 Associations (filtered)...
[INFO] [2020-06-23 17:06:45] 108517 metadata added.
[INFO] [2020-06-23 17:06:45] 0 metadata added.
[INFO] [2020-06-23 17:07:50] 5597 Traits (unfiltered)...
[INFO] [2020-06-23 17:08:04] 5597 Traits (filtered)...
[INFO] [2020-06-23 17:08:04] 0 Associations (filtered)...
[INFO] [2020-06-23 17:08:53] 52084 metadata added.
[INFO] [2020-06-23 17:08:53] 0 metadata added.
[INFO] [2020-06-23 17:08:53] Average Time: 108.98
[INFO] [2020-06-23 17:08:53] Total Time: 4m42s
[STOP] [2020-06-23 17:08:53] overall_tsv_creation
[INFO] [2020-06-23 17:08:53] Done. Check your files:
[INFO] [2020-06-23 17:08:53] (13939 lines) /app/public/data/fao_fishery_stat/publish_nodes.tsv
[INFO] [2020-06-23 17:08:53] (50551 lines) /app/public/data/fao_fishery_stat/publish_node_ancestors.tsv
[INFO] [2020-06-23 17:08:54] (13939 lines) /app/public/data/fao_fishery_stat/publish_scientific_names.tsv
[INFO] [2020-06-23 17:08:54] (17676 lines) /app/public/data/fao_fishery_stat/publish_traits.tsv
[INFO] [2020-06-23 17:08:54] (160602 lines) /app/public/data/fao_fishery_stat/publish_metadata.tsv
[STOP] [2020-06-23 17:08:54] complete_harvest_instance
[START] [2020-06-23 17:08:54] completed
[STOP] [2020-06-23 17:08:54] completed
[STOP] [2020-06-23 17:08:54] logged process, took 2392.54
[INFO] [2020-07-17 13:45:12] ## HARVEST: type = re_download_opendata_-harvest
[INFO] [2020-07-17 13:45:14] ## remove_type: ScientificName
[INFO] [2020-07-17 13:45:14] ++ Calling delete_all on 13939 instances...
[INFO] [2020-07-17 13:45:16] [13:45:16.077] Removed 13939 Scientificnames
[INFO] [2020-07-17 13:45:16] ## remove_type: Vernacular
[INFO] [2020-07-17 13:45:16] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-17 13:45:16] [13:45:16.080] Removed 0 Vernaculars
[INFO] [2020-07-17 13:45:16] ## remove_type: Article
[INFO] [2020-07-17 13:45:16] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-17 13:45:16] [13:45:16.083] Removed 0 Articles
[INFO] [2020-07-17 13:45:16] ## remove_type: Medium
[INFO] [2020-07-17 13:45:16] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-17 13:45:16] [13:45:16.086] Removed 0 Media
[INFO] [2020-07-17 13:45:16] ## remove_type: Trait
[INFO] [2020-07-17 13:45:16] ++ Calling delete_all on 142926 instances...
[INFO] [2020-07-17 13:45:50] [13:45:50.092] Removed 142926 Traits
[INFO] [2020-07-17 13:45:50] ## remove_type: MetaTrait
[INFO] [2020-07-17 13:45:50] ++ Calling delete_all on 159088 instances...
[INFO] [2020-07-17 13:45:58] [13:45:58.641] Removed 159088 Metatraits
[INFO] [2020-07-17 13:45:58] ## remove_type: OccurrenceMetadatum
[INFO] [2020-07-17 13:45:58] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-17 13:45:58] [13:45:58.680] Removed 0 Occurrencemetadata
[INFO] [2020-07-17 13:45:58] ## remove_type: Assoc
[INFO] [2020-07-17 13:45:58] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-17 13:45:58] [13:45:58.683] Removed 0 Assocs
[INFO] [2020-07-17 13:45:58] ## remove_type: MetaAssoc
[INFO] [2020-07-17 13:45:58] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-17 13:45:58] [13:45:58.685] Removed 0 Metaassocs
[INFO] [2020-07-17 13:45:58] ## remove_type: Identifier
[INFO] [2020-07-17 13:45:58] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-17 13:45:58] [13:45:58.690] Removed 0 Identifiers
[INFO] [2020-07-17 13:45:58] ## remove_type: Reference
[INFO] [2020-07-17 13:45:58] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-17 13:45:58] [13:45:58.693] Removed 0 References
[INFO] [2020-07-17 13:45:59] Starting batch with ID 80121792...
[INFO] [2020-07-17 13:46:00] Starting batch with ID 80134631...
[INFO] [2020-07-17 13:46:01] Starting batch with ID 80134631...
[INFO] [2020-07-17 13:46:02] Starting batch with ID 80126992...
[INFO] [2020-07-17 13:46:03] Starting batch with ID 80126992...
[INFO] [2020-07-17 13:46:03] ## remove_type: Node
[INFO] [2020-07-17 13:46:03] ++ Calling delete_all on 13939 instances...
[INFO] [2020-07-17 13:46:04] [13:46:04.698] Removed 13939 Nodes
[START] [2020-07-17 13:46:10] logged process
[START] [2020-07-17 13:46:10] Creating resource from OpenData
[START] [2020-07-17 13:46:11] logged process
[START] [2020-07-17 13:46:11] Parse meta.xml file and create formats with fields
[STOP] [2020-07-17 13:46:11] Parse meta.xml file and create formats with fields
[STOP] [2020-07-17 13:46:11] Creating resource from OpenData
[START] [2020-07-17 13:46:11] logged process
[START] [2020-07-17 13:46:11] create_harvest_instance
[STOP] [2020-07-17 13:46:12] create_harvest_instance
[START] [2020-07-17 13:46:12] fetch_files
[STOP] [2020-07-17 13:46:12] fetch_files
[START] [2020-07-17 13:46:12] validate_each_file
[STOP] [2020-07-17 13:46:18] validate_each_file
[START] [2020-07-17 13:46:18] convert_to_csv
[CMD] [2020-07-17 13:46:18] /usr/bin/sort /app/public/converted_csv/fao_fishery_stat_nodes_21881.csv > /app/public/converted_csv/fao_fishery_stat_nodes_21881.csv_sorted
[CMD] [2020-07-17 13:46:18] /usr/bin/sort /app/public/converted_csv/fao_fishery_stat_occurrences_21882.csv > /app/public/converted_csv/fao_fishery_stat_occurrences_21882.csv_sorted
[CMD] [2020-07-17 13:46:18] /usr/bin/sort /app/public/converted_csv/fao_fishery_stat_measurements_21883.csv > /app/public/converted_csv/fao_fishery_stat_measurements_21883.csv_sorted
[STOP] [2020-07-17 13:46:18] convert_to_csv
[START] [2020-07-17 13:46:19] calculate_delta
[CMD] [2020-07-17 13:46:19] echo "0a" > /app/public/diff/fao_fishery_stat_nodes_21881.diff
[CMD] [2020-07-17 13:46:19] tail -n +1 /app/public/converted_csv/fao_fishery_stat_nodes_21881.csv >> /app/public/diff/fao_fishery_stat_nodes_21881.diff
[CMD] [2020-07-17 13:46:19] echo "." >> /app/public/diff/fao_fishery_stat_nodes_21881.diff
[CMD] [2020-07-17 13:46:19] echo "0a" > /app/public/diff/fao_fishery_stat_occurrences_21882.diff
[CMD] [2020-07-17 13:46:19] tail -n +1 /app/public/converted_csv/fao_fishery_stat_occurrences_21882.csv >> /app/public/diff/fao_fishery_stat_occurrences_21882.diff
[CMD] [2020-07-17 13:46:19] echo "." >> /app/public/diff/fao_fishery_stat_occurrences_21882.diff
[CMD] [2020-07-17 13:46:19] echo "0a" > /app/public/diff/fao_fishery_stat_measurements_21883.diff
[CMD] [2020-07-17 13:46:19] tail -n +1 /app/public/converted_csv/fao_fishery_stat_measurements_21883.csv >> /app/public/diff/fao_fishery_stat_measurements_21883.diff
[CMD] [2020-07-17 13:46:19] echo "." >> /app/public/diff/fao_fishery_stat_measurements_21883.diff
[STOP] [2020-07-17 13:46:19] calculate_delta
[START] [2020-07-17 13:46:19] parse_diff_and_store
[INFO] [2020-07-17 13:46:19] Loading nodes diff file into memory (true lines)...
[INFO] [2020-07-17 13:46:22] Loading occurrences diff file into memory (true lines)...
[INFO] [2020-07-17 13:46:22] Loading measurements diff file into memory (true lines)...
[INFO] [2020-07-17 13:56:24] Storing 13939 ScientificNames
[INFO] [2020-07-17 13:56:24] Processing group of 13939 in 14 groups of 1000
[INFO] [2020-07-17 13:56:29] Average Time: 0.351
[INFO] [2020-07-17 13:56:29] Total Time: 5s
[INFO] [2020-07-17 13:56:29] last 3 / first 3: 0.94
[INFO] [2020-07-17 13:56:29] Std.Dev: 0.03162277660168379; Max: 0.39
[INFO] [2020-07-17 13:56:29] Storing 13939 Nodes
[INFO] [2020-07-17 13:56:29] Processing group of 13939 in 14 groups of 1000
[INFO] [2020-07-17 13:56:35] Average Time: 0.429
[INFO] [2020-07-17 13:56:35] Total Time: 7s
[INFO] [2020-07-17 13:56:35] last 3 / first 3: 0.57
[INFO] [2020-07-17 13:56:35] Std.Dev: 0.4289522117905443; Max: 1.89
[INFO] [2020-07-17 13:56:35] Storing 2269 Occurrences
[INFO] [2020-07-17 13:56:35] Processing group of 2269 in 3 groups of 1000
[INFO] [2020-07-17 13:56:35] Average Time: 0.08
[INFO] [2020-07-17 13:56:35] Total Time: 1s
[INFO] [2020-07-17 13:56:35] Storing 151141 Traits
[INFO] [2020-07-17 13:56:35] Processing group of 151141 in 152 groups of 1000
[INFO] [2020-07-17 13:57:18] Average Time: 0.27
[INFO] [2020-07-17 13:57:18] Total Time: 43s
[INFO] [2020-07-17 13:57:18] last 3 / first 3: 0.77
[INFO] [2020-07-17 13:57:18] Std.Dev: 0.130384048104053; Max: 1.71
[INFO] [2020-07-17 13:57:18] Storing 163107 MetaTraits
[INFO] [2020-07-17 13:57:18] Processing group of 163107 in 164 groups of 1000
[INFO] [2020-07-17 13:57:34] Average Time: 0.093
[INFO] [2020-07-17 13:57:34] Total Time: 16s
[INFO] [2020-07-17 13:57:34] last 3 / first 3: 0.6
[INFO] [2020-07-17 13:57:34] Std.Dev: 0.0; Max: 0.27
[STOP] [2020-07-17 13:57:34] parse_diff_and_store
[START] [2020-07-17 13:57:34] resolve_keys
[INFO] [2020-07-17 13:57:50] Occurrences to nodes (through scientific_names)...
[INFO] [2020-07-17 13:57:50] traits to occurrences...
[INFO] [2020-07-17 13:57:51] traits to nodes (through occurrences)...
[INFO] [2020-07-17 13:57:52] Traits to sex term...
[INFO] [2020-07-17 13:57:52] Traits to lifestage term...
[INFO] [2020-07-17 13:57:52] MetaTraits to traits...
[INFO] [2020-07-17 13:58:03] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2020-07-17 13:58:09] Assocs to occurrences...
[INFO] [2020-07-17 13:58:09] Assocs to nodes...
[INFO] [2020-07-17 13:58:09] Assoc to sex term...
[INFO] [2020-07-17 13:58:09] Assoc to lifestage term...
[STOP] [2020-07-17 13:58:09] resolve_keys
[START] [2020-07-17 13:58:09] hold_for_later_1
[STOP] [2020-07-17 13:58:09] hold_for_later_1
[START] [2020-07-17 13:58:09] hold_for_later_2
[STOP] [2020-07-17 13:58:09] hold_for_later_2
[START] [2020-07-17 13:58:09] resolve_missing_parents
[STOP] [2020-07-17 13:58:11] resolve_missing_parents
[START] [2020-07-17 13:58:11] rebuild_nodes
[START] [2020-07-17 13:58:11] Flattener#flatten
[START] [2020-07-17 13:58:11] Flattener#study_resource
[START] [2020-07-17 13:58:11] Flattener#build_ancestry
[STOP] [2020-07-17 13:58:14] Flattener#build_ancestry
[INFO] [2020-07-17 13:58:14] 13939 ancestry keys
[START] [2020-07-17 13:58:14] build_node_ancestors
[INFO] [2020-07-17 13:58:14] old ancestors deleted.
[STOP] [2020-07-17 13:58:16] build_node_ancestors
[START] [2020-07-17 13:58:19] Flattener#propagate_ancestor_ids
[STOP] [2020-07-17 13:58:20] Flattener#propagate_ancestor_ids
[STOP] [2020-07-17 13:58:20] Flattener#flatten
[STOP] [2020-07-17 13:58:20] rebuild_nodes
[START] [2020-07-17 13:58:20] resolve_missing_media_owners
[STOP] [2020-07-17 13:58:20] resolve_missing_media_owners
[START] [2020-07-17 13:58:20] sanitize_media_verbatims
[STOP] [2020-07-17 13:58:20] sanitize_media_verbatims
[START] [2020-07-17 13:58:20] queue_downloads
[STOP] [2020-07-17 13:58:20] queue_downloads
[START] [2020-07-17 13:58:20] parse_names
[WARN] [2020-07-17 13:58:20] I see 13939 names which still need to be parsed.
[WARN] [2020-07-17 13:58:31] I see 230 names which still need to be parsed.
[WARN] [2020-07-17 13:58:32] I see 8 names which still need to be parsed.
[WARN] [2020-07-17 13:58:33] I see 3 names which still need to be parsed.
[WARN] [2020-07-17 13:58:34] I see 2 names which still need to be parsed.
[WARN] [2020-07-17 13:58:35] I see 1 names which still need to be parsed.
[STOP] [2020-07-17 13:58:37] parse_names
[START] [2020-07-17 13:58:37] denormalize_canonical_names_to_nodes
[STOP] [2020-07-17 13:58:37] denormalize_canonical_names_to_nodes
[START] [2020-07-17 13:58:37] match_nodes
[START] [2020-07-17 13:58:37] map_all_nodes_to_pages
[STOP] [2020-07-17 14:09:41] map_all_nodes_to_pages
[INFO] [2020-07-17 14:09:41] 356 Unmatched nodes (of 13939)! That's too many to output. First 10: Oreochromis andersonii × Oreochromis niloticus (#80560885); Oreochromis aureus × Oreochromis niloticus (#80569369); E. fuscoguttatus × E. lanceolatus (#80563697); Centracanthidae (#80561874); Branchiostegidae (#80562042); Branchiostegidae (#80572010); Centrogeniidae (#80563589); Inermiidae (#80565593); Coiidae (#80565924); Notograptidae (#80568278)
[START] [2020-07-17 14:09:41] update_nodes
[STOP] [2020-07-17 14:09:46] update_nodes
[STOP] [2020-07-17 14:09:46] match_nodes
[START] [2020-07-17 14:09:46] reindex_search
[STOP] [2020-07-17 14:10:02] reindex_search
[START] [2020-07-17 14:10:02] normalize_units
[STOP] [2020-07-17 14:18:37] normalize_units
[START] [2020-07-17 14:18:37] calculate_statistics
[STOP] [2020-07-17 14:18:37] calculate_statistics
[START] [2020-07-17 14:18:37] complete_harvest_instance
[START] [2020-07-17 14:18:37] overall_tsv_creation
[INFO] [2020-07-17 14:18:37] Processing group of 13939 in 2 batches of 10000
[INFO] [2020-07-17 14:20:00] 10544 Traits (unfiltered)...
[INFO] [2020-07-17 14:21:25] 10544 Traits (filtered)...
[INFO] [2020-07-17 14:21:28] 0 Associations (filtered)...
[INFO] [2020-07-17 14:21:35] 82116 metadata added.
[INFO] [2020-07-17 14:21:35] 0 metadata added.
[INFO] [2020-07-17 14:22:37] 5016 Traits (unfiltered)...
[INFO] [2020-07-17 14:23:38] 5016 Traits (filtered)...
[INFO] [2020-07-17 14:23:41] 0 Associations (filtered)...
[INFO] [2020-07-17 14:23:43] 41000 metadata added.
[INFO] [2020-07-17 14:23:43] 0 metadata added.
[INFO] [2020-07-17 14:23:44] Average Time: 120.74
[INFO] [2020-07-17 14:23:44] Total Time: 5m7s
[STOP] [2020-07-17 14:23:44] overall_tsv_creation
[INFO] [2020-07-17 14:23:44] Done. Check your files:
[INFO] [2020-07-17 14:23:44] (13939 lines) /app/public/data/fao_fishery_stat/publish_nodes.tsv
[INFO] [2020-07-17 14:23:44] (50551 lines) /app/public/data/fao_fishery_stat/publish_node_ancestors.tsv
[INFO] [2020-07-17 14:23:44] (13939 lines) /app/public/data/fao_fishery_stat/publish_scientific_names.tsv
[INFO] [2020-07-17 14:23:44] (15561 lines) /app/public/data/fao_fishery_stat/publish_traits.tsv
[INFO] [2020-07-17 14:23:44] (109638 lines) /app/public/data/fao_fishery_stat/publish_metadata.tsv
[STOP] [2020-07-17 14:23:44] complete_harvest_instance
[START] [2020-07-17 14:23:44] completed
[STOP] [2020-07-17 14:23:44] completed
[STOP] [2020-07-17 14:23:44] logged process, took 2253.46
[INFO] [2020-07-17 17:24:15] ## HARVEST: type = re_download_opendata_-harvest
[INFO] [2020-07-17 17:24:18] ## remove_type: ScientificName
[INFO] [2020-07-17 17:24:18] ++ Calling delete_all on 13939 instances...
[INFO] [2020-07-17 17:24:20] [17:24:20.366] Removed 13939 Scientificnames
[INFO] [2020-07-17 17:24:20] ## remove_type: Vernacular
[INFO] [2020-07-17 17:24:20] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-17 17:24:20] [17:24:20.369] Removed 0 Vernaculars
[INFO] [2020-07-17 17:24:20] ## remove_type: Article
[INFO] [2020-07-17 17:24:20] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-17 17:24:20] [17:24:20.372] Removed 0 Articles
[INFO] [2020-07-17 17:24:20] ## remove_type: Medium
[INFO] [2020-07-17 17:24:20] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-17 17:24:20] [17:24:20.376] Removed 0 Media
[INFO] [2020-07-17 17:24:20] ## remove_type: Trait
[INFO] [2020-07-17 17:24:20] ++ Calling delete_all on 151141 instances...
[INFO] [2020-07-17 17:24:55] [17:24:55.612] Removed 151141 Traits
[INFO] [2020-07-17 17:24:55] ## remove_type: MetaTrait
[INFO] [2020-07-17 17:24:55] ++ Calling delete_all on 163107 instances...
[INFO] [2020-07-17 17:25:02] [17:25:02.608] Removed 163107 Metatraits
[INFO] [2020-07-17 17:25:02] ## remove_type: OccurrenceMetadatum
[INFO] [2020-07-17 17:25:02] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-17 17:25:02] [17:25:02.612] Removed 0 Occurrencemetadata
[INFO] [2020-07-17 17:25:02] ## remove_type: Assoc
[INFO] [2020-07-17 17:25:02] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-17 17:25:02] [17:25:02.614] Removed 0 Assocs
[INFO] [2020-07-17 17:25:02] ## remove_type: MetaAssoc
[INFO] [2020-07-17 17:25:02] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-17 17:25:02] [17:25:02.617] Removed 0 Metaassocs
[INFO] [2020-07-17 17:25:02] ## remove_type: Identifier
[INFO] [2020-07-17 17:25:02] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-17 17:25:02] [17:25:02.619] Removed 0 Identifiers
[INFO] [2020-07-17 17:25:02] ## remove_type: Reference
[INFO] [2020-07-17 17:25:02] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-17 17:25:02] [17:25:02.622] Removed 0 References
[INFO] [2020-07-17 17:25:02] Starting batch with ID 80560009...
[INFO] [2020-07-17 17:25:03] Starting batch with ID 80560009...
[INFO] [2020-07-17 17:25:04] Starting batch with ID 80571146...
[INFO] [2020-07-17 17:25:05] Starting batch with ID 80571146...
[INFO] [2020-07-17 17:25:05] Starting batch with ID 80572051...
[INFO] [2020-07-17 17:25:06] Starting batch with ID 80572051...
[INFO] [2020-07-17 17:25:07] Starting batch with ID 80572051...
[INFO] [2020-07-17 17:25:07] ## remove_type: Node
[INFO] [2020-07-17 17:25:07] ++ Calling delete_all on 13939 instances...
[INFO] [2020-07-17 17:25:08] [17:25:08.673] Removed 13939 Nodes
[START] [2020-07-17 17:25:11] logged process
[START] [2020-07-17 17:25:11] Creating resource from OpenData
[START] [2020-07-17 17:25:12] logged process
[START] [2020-07-17 17:25:12] Parse meta.xml file and create formats with fields
[STOP] [2020-07-17 17:25:12] Parse meta.xml file and create formats with fields
[STOP] [2020-07-17 17:25:12] Creating resource from OpenData
[START] [2020-07-17 17:25:12] logged process
[START] [2020-07-17 17:25:12] create_harvest_instance
[STOP] [2020-07-17 17:25:13] create_harvest_instance
[START] [2020-07-17 17:25:13] fetch_files
[STOP] [2020-07-17 17:25:13] fetch_files
[START] [2020-07-17 17:25:13] validate_each_file
[STOP] [2020-07-17 17:25:20] validate_each_file
[START] [2020-07-17 17:25:20] convert_to_csv
[CMD] [2020-07-17 17:25:20] /usr/bin/sort /app/public/converted_csv/fao_fishery_stat_nodes_21903.csv > /app/public/converted_csv/fao_fishery_stat_nodes_21903.csv_sorted
[CMD] [2020-07-17 17:25:20] /usr/bin/sort /app/public/converted_csv/fao_fishery_stat_occurrences_21904.csv > /app/public/converted_csv/fao_fishery_stat_occurrences_21904.csv_sorted
[CMD] [2020-07-17 17:25:20] /usr/bin/sort /app/public/converted_csv/fao_fishery_stat_measurements_21905.csv > /app/public/converted_csv/fao_fishery_stat_measurements_21905.csv_sorted
[STOP] [2020-07-17 17:25:20] convert_to_csv
[START] [2020-07-17 17:25:20] calculate_delta
[CMD] [2020-07-17 17:25:20] echo "0a" > /app/public/diff/fao_fishery_stat_nodes_21903.diff
[CMD] [2020-07-17 17:25:20] tail -n +1 /app/public/converted_csv/fao_fishery_stat_nodes_21903.csv >> /app/public/diff/fao_fishery_stat_nodes_21903.diff
[CMD] [2020-07-17 17:25:20] echo "." >> /app/public/diff/fao_fishery_stat_nodes_21903.diff
[CMD] [2020-07-17 17:25:20] echo "0a" > /app/public/diff/fao_fishery_stat_occurrences_21904.diff
[CMD] [2020-07-17 17:25:20] tail -n +1 /app/public/converted_csv/fao_fishery_stat_occurrences_21904.csv >> /app/public/diff/fao_fishery_stat_occurrences_21904.diff
[CMD] [2020-07-17 17:25:20] echo "." >> /app/public/diff/fao_fishery_stat_occurrences_21904.diff
[CMD] [2020-07-17 17:25:20] echo "0a" > /app/public/diff/fao_fishery_stat_measurements_21905.diff
[CMD] [2020-07-17 17:25:20] tail -n +1 /app/public/converted_csv/fao_fishery_stat_measurements_21905.csv >> /app/public/diff/fao_fishery_stat_measurements_21905.diff
[CMD] [2020-07-17 17:25:20] echo "." >> /app/public/diff/fao_fishery_stat_measurements_21905.diff
[STOP] [2020-07-17 17:25:20] calculate_delta
[START] [2020-07-17 17:25:20] parse_diff_and_store
[INFO] [2020-07-17 17:25:20] Loading nodes diff file into memory (true lines)...
[INFO] [2020-07-17 17:25:23] Loading occurrences diff file into memory (true lines)...
[INFO] [2020-07-17 17:25:24] Loading measurements diff file into memory (true lines)...
[INFO] [2020-07-17 17:35:25] Storing 13939 ScientificNames
[INFO] [2020-07-17 17:35:25] Processing group of 13939 in 14 groups of 1000
[INFO] [2020-07-17 17:35:31] Average Time: 0.433
[INFO] [2020-07-17 17:35:31] Total Time: 7s
[INFO] [2020-07-17 17:35:31] last 3 / first 3: 1.37
[INFO] [2020-07-17 17:35:31] Std.Dev: 0.08944271909999159; Max: 0.58
[INFO] [2020-07-17 17:35:31] Storing 13939 Nodes
[INFO] [2020-07-17 17:35:31] Processing group of 13939 in 14 groups of 1000
[INFO] [2020-07-17 17:35:37] Average Time: 0.359
[INFO] [2020-07-17 17:35:37] Total Time: 6s
[INFO] [2020-07-17 17:35:37] last 3 / first 3: 0.41
[INFO] [2020-07-17 17:35:37] Std.Dev: 0.23664319132398465; Max: 1.14
[INFO] [2020-07-17 17:35:37] Storing 2269 Occurrences
[INFO] [2020-07-17 17:35:37] Processing group of 2269 in 3 groups of 1000
[INFO] [2020-07-17 17:35:37] Average Time: 0.077
[INFO] [2020-07-17 17:35:37] Total Time: 1s
[INFO] [2020-07-17 17:35:37] Storing 151141 Traits
[INFO] [2020-07-17 17:35:37] Processing group of 151141 in 152 groups of 1000
[INFO] [2020-07-17 17:36:19] Average Time: 0.276
[INFO] [2020-07-17 17:36:19] Total Time: 43s
[INFO] [2020-07-17 17:36:19] last 3 / first 3: 0.67
[INFO] [2020-07-17 17:36:19] Std.Dev: 0.2469817807045694; Max: 2.53
[INFO] [2020-07-17 17:36:19] Storing 163107 MetaTraits
[INFO] [2020-07-17 17:36:19] Processing group of 163107 in 164 groups of 1000
[INFO] [2020-07-17 17:36:35] Average Time: 0.091
[INFO] [2020-07-17 17:36:35] Total Time: 16s
[INFO] [2020-07-17 17:36:35] last 3 / first 3: 0.61
[INFO] [2020-07-17 17:36:35] Std.Dev: 0.0; Max: 0.25
[STOP] [2020-07-17 17:36:35] parse_diff_and_store
[START] [2020-07-17 17:36:35] resolve_keys
[INFO] [2020-07-17 17:36:47] Occurrences to nodes (through scientific_names)...
[INFO] [2020-07-17 17:36:47] traits to occurrences...
[INFO] [2020-07-17 17:36:48] traits to nodes (through occurrences)...
[INFO] [2020-07-17 17:36:49] Traits to sex term...
[INFO] [2020-07-17 17:36:49] Traits to lifestage term...
[INFO] [2020-07-17 17:36:49] MetaTraits to traits...
[INFO] [2020-07-17 17:36:57] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2020-07-17 17:37:04] Assocs to occurrences...
[INFO] [2020-07-17 17:37:04] Assocs to nodes...
[INFO] [2020-07-17 17:37:04] Assoc to sex term...
[INFO] [2020-07-17 17:37:04] Assoc to lifestage term...
[STOP] [2020-07-17 17:37:04] resolve_keys
[START] [2020-07-17 17:37:04] hold_for_later_1
[STOP] [2020-07-17 17:37:04] hold_for_later_1
[START] [2020-07-17 17:37:04] hold_for_later_2
[STOP] [2020-07-17 17:37:04] hold_for_later_2
[START] [2020-07-17 17:37:04] resolve_missing_parents
[STOP] [2020-07-17 17:37:05] resolve_missing_parents
[START] [2020-07-17 17:37:05] rebuild_nodes
[START] [2020-07-17 17:37:05] Flattener#flatten
[START] [2020-07-17 17:37:05] Flattener#study_resource
[START] [2020-07-17 17:37:05] Flattener#build_ancestry
[STOP] [2020-07-17 17:37:09] Flattener#build_ancestry
[INFO] [2020-07-17 17:37:09] 13939 ancestry keys
[START] [2020-07-17 17:37:09] build_node_ancestors
[INFO] [2020-07-17 17:37:09] old ancestors deleted.
[STOP] [2020-07-17 17:37:11] build_node_ancestors
[START] [2020-07-17 17:37:14] Flattener#propagate_ancestor_ids
[STOP] [2020-07-17 17:37:15] Flattener#propagate_ancestor_ids
[STOP] [2020-07-17 17:37:15] Flattener#flatten
[STOP] [2020-07-17 17:37:15] rebuild_nodes
[START] [2020-07-17 17:37:15] resolve_missing_media_owners
[STOP] [2020-07-17 17:37:15] resolve_missing_media_owners
[START] [2020-07-17 17:37:15] sanitize_media_verbatims
[STOP] [2020-07-17 17:37:15] sanitize_media_verbatims
[START] [2020-07-17 17:37:15] queue_downloads
[STOP] [2020-07-17 17:37:15] queue_downloads
[START] [2020-07-17 17:37:15] parse_names
[WARN] [2020-07-17 17:37:15] I see 13939 names which still need to be parsed.
[WARN] [2020-07-17 17:37:26] I see 230 names which still need to be parsed.
[WARN] [2020-07-17 17:37:27] I see 8 names which still need to be parsed.
[WARN] [2020-07-17 17:37:28] I see 3 names which still need to be parsed.
[WARN] [2020-07-17 17:37:29] I see 2 names which still need to be parsed.
[WARN] [2020-07-17 17:37:30] I see 1 names which still need to be parsed.
[STOP] [2020-07-17 17:37:32] parse_names
[START] [2020-07-17 17:37:32] denormalize_canonical_names_to_nodes
[STOP] [2020-07-17 17:37:32] denormalize_canonical_names_to_nodes
[START] [2020-07-17 17:37:32] match_nodes
[START] [2020-07-17 17:37:32] map_all_nodes_to_pages
[STOP] [2020-07-17 17:50:29] map_all_nodes_to_pages
[INFO] [2020-07-17 17:50:29] 356 Unmatched nodes (of 13939)! That's too many to output. First 10: Oreochromis andersonii × Oreochromis niloticus (#80577650); Oreochromis aureus × Oreochromis niloticus (#80586134); E. fuscoguttatus × E. lanceolatus (#80580462); Centracanthidae (#80578639); Branchiostegidae (#80578807); Branchiostegidae (#80588775); Centrogeniidae (#80580354); Inermiidae (#80582358); Coiidae (#80582689); Notograptidae (#80585043)
[START] [2020-07-17 17:50:29] update_nodes
[STOP] [2020-07-17 17:50:35] update_nodes
[STOP] [2020-07-17 17:50:35] match_nodes
[START] [2020-07-17 17:50:35] reindex_search
[STOP] [2020-07-17 17:50:53] reindex_search
[START] [2020-07-17 17:50:53] normalize_units
[STOP] [2020-07-17 17:59:34] normalize_units
[START] [2020-07-17 17:59:34] calculate_statistics
[STOP] [2020-07-17 17:59:35] calculate_statistics
[START] [2020-07-17 17:59:35] complete_harvest_instance
[START] [2020-07-17 17:59:35] overall_tsv_creation
[INFO] [2020-07-17 17:59:35] Processing group of 13939 in 2 batches of 10000
[INFO] [2020-07-17 18:00:53] 10544 Traits (unfiltered)...
[INFO] [2020-07-17 18:02:18] 10544 Traits (filtered)...
[INFO] [2020-07-17 18:02:21] 0 Associations (filtered)...
[INFO] [2020-07-17 18:02:26] 83556 metadata added.
[INFO] [2020-07-17 18:02:26] 0 metadata added.
[INFO] [2020-07-17 18:03:30] 5016 Traits (unfiltered)...
[INFO] [2020-07-17 18:04:32] 5016 Traits (filtered)...
[INFO] [2020-07-17 18:04:35] 0 Associations (filtered)...
[INFO] [2020-07-17 18:04:38] 41641 metadata added.
[INFO] [2020-07-17 18:04:38] 0 metadata added.
[INFO] [2020-07-17 18:04:39] Average Time: 121.255
[INFO] [2020-07-17 18:04:39] Total Time: 5m4s
[STOP] [2020-07-17 18:04:39] overall_tsv_creation
[INFO] [2020-07-17 18:04:39] Done. Check your files:
[INFO] [2020-07-17 18:04:39] (13939 lines) /app/public/data/fao_fishery_stat/publish_nodes.tsv
[INFO] [2020-07-17 18:04:39] (50551 lines) /app/public/data/fao_fishery_stat/publish_node_ancestors.tsv
[INFO] [2020-07-17 18:04:39] (13939 lines) /app/public/data/fao_fishery_stat/publish_scientific_names.tsv
[INFO] [2020-07-17 18:04:39] (15561 lines) /app/public/data/fao_fishery_stat/publish_traits.tsv
[INFO] [2020-07-17 18:04:39] (109638 lines) /app/public/data/fao_fishery_stat/publish_metadata.tsv
[STOP] [2020-07-17 18:04:39] complete_harvest_instance
[START] [2020-07-17 18:04:39] completed
[STOP] [2020-07-17 18:04:39] completed
[STOP] [2020-07-17 18:04:39] logged process, took 2366.93
[INFO] [2020-07-20 09:26:03] ## HARVEST: type = re_download_opendata_-harvest
[INFO] [2020-07-20 09:26:03] ## remove_type: ScientificName
[INFO] [2020-07-20 09:26:03] ++ Calling delete_all on 13939 instances...
[INFO] [2020-07-20 09:26:05] [09:26:05.388] Removed 13939 Scientificnames
[INFO] [2020-07-20 09:26:05] ## remove_type: Vernacular
[INFO] [2020-07-20 09:26:05] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-20 09:26:05] [09:26:05.391] Removed 0 Vernaculars
[INFO] [2020-07-20 09:26:05] ## remove_type: Article
[INFO] [2020-07-20 09:26:05] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-20 09:26:05] [09:26:05.394] Removed 0 Articles
[INFO] [2020-07-20 09:26:05] ## remove_type: Medium
[INFO] [2020-07-20 09:26:05] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-20 09:26:05] [09:26:05.398] Removed 0 Media
[INFO] [2020-07-20 09:26:05] ## remove_type: Trait
[INFO] [2020-07-20 09:26:05] ++ Calling delete_all on 151141 instances...
[INFO] [2020-07-20 09:26:40] [09:26:40.434] Removed 151141 Traits
[INFO] [2020-07-20 09:26:40] ## remove_type: MetaTrait
[INFO] [2020-07-20 09:26:40] ++ Calling delete_all on 163107 instances...
[INFO] [2020-07-20 09:26:48] [09:26:48.030] Removed 163107 Metatraits
[INFO] [2020-07-20 09:26:48] ## remove_type: OccurrenceMetadatum
[INFO] [2020-07-20 09:26:48] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-20 09:26:48] [09:26:48.033] Removed 0 Occurrencemetadata
[INFO] [2020-07-20 09:26:48] ## remove_type: Assoc
[INFO] [2020-07-20 09:26:48] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-20 09:26:48] [09:26:48.042] Removed 0 Assocs
[INFO] [2020-07-20 09:26:48] ## remove_type: MetaAssoc
[INFO] [2020-07-20 09:26:48] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-20 09:26:48] [09:26:48.044] Removed 0 Metaassocs
[INFO] [2020-07-20 09:26:48] ## remove_type: Identifier
[INFO] [2020-07-20 09:26:48] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-20 09:26:48] [09:26:48.068] Removed 0 Identifiers
[INFO] [2020-07-20 09:26:48] ## remove_type: Reference
[INFO] [2020-07-20 09:26:48] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-20 09:26:48] [09:26:48.071] Removed 0 References
[INFO] [2020-07-20 09:26:48] Starting batch with ID 80577136...
[INFO] [2020-07-20 09:26:49] Starting batch with ID 80579518...
[INFO] [2020-07-20 09:26:50] Starting batch with ID 80584122...
[INFO] [2020-07-20 09:26:51] Starting batch with ID 80590658...
[INFO] [2020-07-20 09:26:51] Starting batch with ID 80590658...
[INFO] [2020-07-20 09:26:52] Starting batch with ID 80590658...
[INFO] [2020-07-20 09:26:52] ## remove_type: Node
[INFO] [2020-07-20 09:26:52] ++ Calling delete_all on 13939 instances...
[INFO] [2020-07-20 09:26:53] [09:26:53.629] Removed 13939 Nodes
[START] [2020-07-20 09:26:56] logged process
[START] [2020-07-20 09:26:56] Creating resource from OpenData
[START] [2020-07-20 09:26:57] logged process
[START] [2020-07-20 09:26:57] Parse meta.xml file and create formats with fields
[STOP] [2020-07-20 09:26:57] Parse meta.xml file and create formats with fields
[STOP] [2020-07-20 09:26:57] Creating resource from OpenData
[START] [2020-07-20 09:26:57] logged process
[START] [2020-07-20 09:26:57] create_harvest_instance
[STOP] [2020-07-20 09:26:58] create_harvest_instance
[START] [2020-07-20 09:26:58] fetch_files
[STOP] [2020-07-20 09:26:58] fetch_files
[START] [2020-07-20 09:26:58] validate_each_file
[STOP] [2020-07-20 09:27:05] validate_each_file
[START] [2020-07-20 09:27:05] convert_to_csv
[CMD] [2020-07-20 09:27:05] /usr/bin/sort /app/public/converted_csv/fao_fishery_stat_nodes_21909.csv > /app/public/converted_csv/fao_fishery_stat_nodes_21909.csv_sorted
[CMD] [2020-07-20 09:27:05] /usr/bin/sort /app/public/converted_csv/fao_fishery_stat_occurrences_21910.csv > /app/public/converted_csv/fao_fishery_stat_occurrences_21910.csv_sorted
[CMD] [2020-07-20 09:27:05] /usr/bin/sort /app/public/converted_csv/fao_fishery_stat_measurements_21911.csv > /app/public/converted_csv/fao_fishery_stat_measurements_21911.csv_sorted
[STOP] [2020-07-20 09:27:05] convert_to_csv
[START] [2020-07-20 09:27:05] calculate_delta
[CMD] [2020-07-20 09:27:05] echo "0a" > /app/public/diff/fao_fishery_stat_nodes_21909.diff
[CMD] [2020-07-20 09:27:05] tail -n +1 /app/public/converted_csv/fao_fishery_stat_nodes_21909.csv >> /app/public/diff/fao_fishery_stat_nodes_21909.diff
[CMD] [2020-07-20 09:27:05] echo "." >> /app/public/diff/fao_fishery_stat_nodes_21909.diff
[CMD] [2020-07-20 09:27:05] echo "0a" > /app/public/diff/fao_fishery_stat_occurrences_21910.diff
[CMD] [2020-07-20 09:27:05] tail -n +1 /app/public/converted_csv/fao_fishery_stat_occurrences_21910.csv >> /app/public/diff/fao_fishery_stat_occurrences_21910.diff
[CMD] [2020-07-20 09:27:05] echo "." >> /app/public/diff/fao_fishery_stat_occurrences_21910.diff
[CMD] [2020-07-20 09:27:05] echo "0a" > /app/public/diff/fao_fishery_stat_measurements_21911.diff
[CMD] [2020-07-20 09:27:05] tail -n +1 /app/public/converted_csv/fao_fishery_stat_measurements_21911.csv >> /app/public/diff/fao_fishery_stat_measurements_21911.diff
[CMD] [2020-07-20 09:27:05] echo "." >> /app/public/diff/fao_fishery_stat_measurements_21911.diff
[STOP] [2020-07-20 09:27:05] calculate_delta
[START] [2020-07-20 09:27:05] parse_diff_and_store
[INFO] [2020-07-20 09:27:05] Loading nodes diff file into memory (true lines)...
[INFO] [2020-07-20 09:27:09] Loading occurrences diff file into memory (true lines)...
[INFO] [2020-07-20 09:27:10] Loading measurements diff file into memory (true lines)...
[INFO] [2020-07-20 09:37:21] Storing 13939 ScientificNames
[INFO] [2020-07-20 09:37:21] Processing group of 13939 in 14 groups of 1000
[INFO] [2020-07-20 09:37:26] Average Time: 0.345
[INFO] [2020-07-20 09:37:26] Total Time: 5s
[INFO] [2020-07-20 09:37:26] last 3 / first 3: 1.01
[INFO] [2020-07-20 09:37:26] Std.Dev: 0.03162277660168379; Max: 0.44
[INFO] [2020-07-20 09:37:26] Storing 13939 Nodes
[INFO] [2020-07-20 09:37:26] Processing group of 13939 in 14 groups of 1000
[INFO] [2020-07-20 09:37:30] Average Time: 0.284
[INFO] [2020-07-20 09:37:30] Total Time: 5s
[INFO] [2020-07-20 09:37:30] last 3 / first 3: 1.0
[INFO] [2020-07-20 09:37:30] Std.Dev: 0.03162277660168379; Max: 0.35
[INFO] [2020-07-20 09:37:30] Storing 12418 Occurrences
[INFO] [2020-07-20 09:37:30] Processing group of 12418 in 13 groups of 1000
[INFO] [2020-07-20 09:37:32] Average Time: 0.119
[INFO] [2020-07-20 09:37:32] Total Time: 2s
[INFO] [2020-07-20 09:37:32] last 3 / first 3: 1.21
[INFO] [2020-07-20 09:37:32] Std.Dev: 0.044721359549995794; Max: 0.25
[INFO] [2020-07-20 09:37:32] Storing 151141 Traits
[INFO] [2020-07-20 09:37:32] Processing group of 151141 in 152 groups of 1000
[INFO] [2020-07-20 09:38:15] Average Time: 0.282
[INFO] [2020-07-20 09:38:15] Total Time: 44s
[INFO] [2020-07-20 09:38:15] last 3 / first 3: 0.62
[INFO] [2020-07-20 09:38:15] Std.Dev: 0.21213203435596426; Max: 2.13
[INFO] [2020-07-20 09:38:15] Storing 163107 MetaTraits
[INFO] [2020-07-20 09:38:15] Processing group of 163107 in 164 groups of 1000
[INFO] [2020-07-20 09:38:33] Average Time: 0.109
[INFO] [2020-07-20 09:38:33] Total Time: 19s
[INFO] [2020-07-20 09:38:33] last 3 / first 3: 0.68
[INFO] [2020-07-20 09:38:33] Std.Dev: 0.10488088481701516; Max: 1.41
[STOP] [2020-07-20 09:38:33] parse_diff_and_store
[START] [2020-07-20 09:38:33] resolve_keys
[INFO] [2020-07-20 09:38:46] Occurrences to nodes (through scientific_names)...
[INFO] [2020-07-20 09:38:46] traits to occurrences...
[INFO] [2020-07-20 09:38:48] traits to nodes (through occurrences)...
[INFO] [2020-07-20 09:38:49] Traits to sex term...
[INFO] [2020-07-20 09:38:50] Traits to lifestage term...
[INFO] [2020-07-20 09:38:50] MetaTraits to traits...
[INFO] [2020-07-20 09:39:01] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2020-07-20 09:39:08] Assocs to occurrences...
[INFO] [2020-07-20 09:39:08] Assocs to nodes...
[INFO] [2020-07-20 09:39:08] Assoc to sex term...
[INFO] [2020-07-20 09:39:08] Assoc to lifestage term...
[STOP] [2020-07-20 09:39:08] resolve_keys
[START] [2020-07-20 09:39:08] hold_for_later_1
[STOP] [2020-07-20 09:39:08] hold_for_later_1
[START] [2020-07-20 09:39:08] hold_for_later_2
[STOP] [2020-07-20 09:39:08] hold_for_later_2
[START] [2020-07-20 09:39:08] resolve_missing_parents
[STOP] [2020-07-20 09:39:09] resolve_missing_parents
[START] [2020-07-20 09:39:09] rebuild_nodes
[START] [2020-07-20 09:39:09] Flattener#flatten
[START] [2020-07-20 09:39:09] Flattener#study_resource
[START] [2020-07-20 09:39:09] Flattener#build_ancestry
[STOP] [2020-07-20 09:39:10] Flattener#build_ancestry
[INFO] [2020-07-20 09:39:10] 13939 ancestry keys
[START] [2020-07-20 09:39:10] build_node_ancestors
[INFO] [2020-07-20 09:39:10] old ancestors deleted.
[STOP] [2020-07-20 09:39:12] build_node_ancestors
[START] [2020-07-20 09:39:15] Flattener#propagate_ancestor_ids
[STOP] [2020-07-20 09:39:16] Flattener#propagate_ancestor_ids
[STOP] [2020-07-20 09:39:16] Flattener#flatten
[STOP] [2020-07-20 09:39:16] rebuild_nodes
[START] [2020-07-20 09:39:16] resolve_missing_media_owners
[STOP] [2020-07-20 09:39:16] resolve_missing_media_owners
[START] [2020-07-20 09:39:16] sanitize_media_verbatims
[STOP] [2020-07-20 09:39:16] sanitize_media_verbatims
[START] [2020-07-20 09:39:16] queue_downloads
[STOP] [2020-07-20 09:39:16] queue_downloads
[START] [2020-07-20 09:39:16] parse_names
[WARN] [2020-07-20 09:39:16] I see 13939 names which still need to be parsed.
[WARN] [2020-07-20 09:39:27] I see 230 names which still need to be parsed.
[WARN] [2020-07-20 09:39:28] I see 8 names which still need to be parsed.
[WARN] [2020-07-20 09:39:29] I see 3 names which still need to be parsed.
[WARN] [2020-07-20 09:39:30] I see 2 names which still need to be parsed.
[WARN] [2020-07-20 09:39:31] I see 1 names which still need to be parsed.
[STOP] [2020-07-20 09:39:33] parse_names
[START] [2020-07-20 09:39:33] denormalize_canonical_names_to_nodes
[STOP] [2020-07-20 09:39:33] denormalize_canonical_names_to_nodes
[START] [2020-07-20 09:39:33] match_nodes
[START] [2020-07-20 09:39:33] map_all_nodes_to_pages
[STOP] [2020-07-20 09:53:21] map_all_nodes_to_pages
[INFO] [2020-07-20 09:53:21] 356 Unmatched nodes (of 13939)! That's too many to output. First 10: Oreochromis andersonii × Oreochromis niloticus (#80591589); Oreochromis aureus × Oreochromis niloticus (#80600073); E. fuscoguttatus × E. lanceolatus (#80594401); Centracanthidae (#80592578); Branchiostegidae (#80592746); Branchiostegidae (#80602714); Centrogeniidae (#80594293); Inermiidae (#80596297); Coiidae (#80596628); Notograptidae (#80598982)
[START] [2020-07-20 09:53:21] update_nodes
[STOP] [2020-07-20 09:53:27] update_nodes
[STOP] [2020-07-20 09:53:27] match_nodes
[START] [2020-07-20 09:53:27] reindex_search
[STOP] [2020-07-20 09:53:44] reindex_search
[START] [2020-07-20 09:53:44] normalize_units
[STOP] [2020-07-20 10:02:27] normalize_units
[START] [2020-07-20 10:02:27] calculate_statistics
[STOP] [2020-07-20 10:02:27] calculate_statistics
[START] [2020-07-20 10:02:27] complete_harvest_instance
[START] [2020-07-20 10:02:27] overall_tsv_creation
[INFO] [2020-07-20 10:02:27] Processing group of 13939 in 2 batches of 10000
[INFO] [2020-07-20 10:04:47] 17861 Traits (unfiltered)...
[INFO] [2020-07-20 10:06:43] 17861 Traits (filtered)...
[INFO] [2020-07-20 10:06:46] 0 Associations (filtered)...
[INFO] [2020-07-20 10:06:53] 90873 metadata added.
[INFO] [2020-07-20 10:06:53] 0 metadata added.
[INFO] [2020-07-20 10:08:06] 8029 Traits (unfiltered)...
[INFO] [2020-07-20 10:09:20] 8029 Traits (filtered)...
[INFO] [2020-07-20 10:09:23] 0 Associations (filtered)...
[INFO] [2020-07-20 10:09:26] 44654 metadata added.
[INFO] [2020-07-20 10:09:26] 0 metadata added.
[INFO] [2020-07-20 10:09:27] Average Time: 144.445
[INFO] [2020-07-20 10:09:27] Total Time: 6m60s
[STOP] [2020-07-20 10:09:27] overall_tsv_creation
[INFO] [2020-07-20 10:09:27] Done. Check your files:
[INFO] [2020-07-20 10:09:27] (13939 lines) /app/public/data/fao_fishery_stat/publish_nodes.tsv
[INFO] [2020-07-20 10:09:27] (50551 lines) /app/public/data/fao_fishery_stat/publish_node_ancestors.tsv
[INFO] [2020-07-20 10:09:27] (13939 lines) /app/public/data/fao_fishery_stat/publish_scientific_names.tsv
[INFO] [2020-07-20 10:09:27] (25891 lines) /app/public/data/fao_fishery_stat/publish_traits.tsv
[INFO] [2020-07-20 10:09:27] (109638 lines) /app/public/data/fao_fishery_stat/publish_metadata.tsv
[STOP] [2020-07-20 10:09:27] complete_harvest_instance
[START] [2020-07-20 10:09:27] completed
[STOP] [2020-07-20 10:09:27] completed
[STOP] [2020-07-20 10:09:27] logged process, took 2550.09
[INFO] [2020-07-20 11:11:31] ## HARVEST: type = re_download_opendata_-harvest
[INFO] [2020-07-20 11:11:33] ## remove_type: ScientificName
[INFO] [2020-07-20 11:11:33] ++ Calling delete_all on 13939 instances...
[INFO] [2020-07-20 11:11:35] [11:11:35.141] Removed 13939 Scientificnames
[INFO] [2020-07-20 11:11:35] ## remove_type: Vernacular
[INFO] [2020-07-20 11:11:35] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-20 11:11:35] [11:11:35.145] Removed 0 Vernaculars
[INFO] [2020-07-20 11:11:35] ## remove_type: Article
[INFO] [2020-07-20 11:11:35] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-20 11:11:35] [11:11:35.148] Removed 0 Articles
[INFO] [2020-07-20 11:11:35] ## remove_type: Medium
[INFO] [2020-07-20 11:11:35] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-20 11:11:35] [11:11:35.151] Removed 0 Media
[INFO] [2020-07-20 11:11:35] ## remove_type: Trait
[INFO] [2020-07-20 11:11:35] ++ Calling delete_all on 151141 instances...
[INFO] [2020-07-20 11:12:09] [11:12:09.028] Removed 151141 Traits
[INFO] [2020-07-20 11:12:09] ## remove_type: MetaTrait
[INFO] [2020-07-20 11:12:09] ++ Calling delete_all on 163107 instances...
[INFO] [2020-07-20 11:12:16] [11:12:16.666] Removed 163107 Metatraits
[INFO] [2020-07-20 11:12:16] ## remove_type: OccurrenceMetadatum
[INFO] [2020-07-20 11:12:16] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-20 11:12:16] [11:12:16.669] Removed 0 Occurrencemetadata
[INFO] [2020-07-20 11:12:16] ## remove_type: Assoc
[INFO] [2020-07-20 11:12:16] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-20 11:12:16] [11:12:16.672] Removed 0 Assocs
[INFO] [2020-07-20 11:12:16] ## remove_type: MetaAssoc
[INFO] [2020-07-20 11:12:16] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-20 11:12:16] [11:12:16.674] Removed 0 Metaassocs
[INFO] [2020-07-20 11:12:16] ## remove_type: Identifier
[INFO] [2020-07-20 11:12:16] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-20 11:12:16] [11:12:16.677] Removed 0 Identifiers
[INFO] [2020-07-20 11:12:16] ## remove_type: Reference
[INFO] [2020-07-20 11:12:16] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-20 11:12:16] [11:12:16.679] Removed 0 References
[INFO] [2020-07-20 11:12:17] Starting batch with ID 80590717...
[INFO] [2020-07-20 11:12:18] Starting batch with ID 80604034...
[INFO] [2020-07-20 11:12:19] Starting batch with ID 80599774...
[INFO] [2020-07-20 11:12:20] Starting batch with ID 80597696...
[INFO] [2020-07-20 11:12:20] Starting batch with ID 80597696...
[INFO] [2020-07-20 11:12:21] ## remove_type: Node
[INFO] [2020-07-20 11:12:21] ++ Calling delete_all on 13939 instances...
[INFO] [2020-07-20 11:12:22] [11:12:22.345] Removed 13939 Nodes
[START] [2020-07-20 11:12:25] logged process
[START] [2020-07-20 11:12:25] Creating resource from OpenData
[START] [2020-07-20 11:12:26] logged process
[START] [2020-07-20 11:12:26] Parse meta.xml file and create formats with fields
[STOP] [2020-07-20 11:12:26] Parse meta.xml file and create formats with fields
[STOP] [2020-07-20 11:12:26] Creating resource from OpenData
[START] [2020-07-20 11:12:26] logged process
[START] [2020-07-20 11:12:26] create_harvest_instance
[STOP] [2020-07-20 11:12:27] create_harvest_instance
[START] [2020-07-20 11:12:27] fetch_files
[STOP] [2020-07-20 11:12:27] fetch_files
[START] [2020-07-20 11:12:27] validate_each_file
[STOP] [2020-07-20 11:12:34] validate_each_file
[START] [2020-07-20 11:12:34] convert_to_csv
[CMD] [2020-07-20 11:12:34] /usr/bin/sort /app/public/converted_csv/fao_fishery_stat_nodes_21915.csv > /app/public/converted_csv/fao_fishery_stat_nodes_21915.csv_sorted
[CMD] [2020-07-20 11:12:34] /usr/bin/sort /app/public/converted_csv/fao_fishery_stat_occurrences_21916.csv > /app/public/converted_csv/fao_fishery_stat_occurrences_21916.csv_sorted
[CMD] [2020-07-20 11:12:34] /usr/bin/sort /app/public/converted_csv/fao_fishery_stat_measurements_21917.csv > /app/public/converted_csv/fao_fishery_stat_measurements_21917.csv_sorted
[STOP] [2020-07-20 11:12:34] convert_to_csv
[START] [2020-07-20 11:12:34] calculate_delta
[CMD] [2020-07-20 11:12:34] echo "0a" > /app/public/diff/fao_fishery_stat_nodes_21915.diff
[CMD] [2020-07-20 11:12:34] tail -n +1 /app/public/converted_csv/fao_fishery_stat_nodes_21915.csv >> /app/public/diff/fao_fishery_stat_nodes_21915.diff
[CMD] [2020-07-20 11:12:34] echo "." >> /app/public/diff/fao_fishery_stat_nodes_21915.diff
[CMD] [2020-07-20 11:12:34] echo "0a" > /app/public/diff/fao_fishery_stat_occurrences_21916.diff
[CMD] [2020-07-20 11:12:34] tail -n +1 /app/public/converted_csv/fao_fishery_stat_occurrences_21916.csv >> /app/public/diff/fao_fishery_stat_occurrences_21916.diff
[CMD] [2020-07-20 11:12:34] echo "." >> /app/public/diff/fao_fishery_stat_occurrences_21916.diff
[CMD] [2020-07-20 11:12:34] echo "0a" > /app/public/diff/fao_fishery_stat_measurements_21917.diff
[CMD] [2020-07-20 11:12:34] tail -n +1 /app/public/converted_csv/fao_fishery_stat_measurements_21917.csv >> /app/public/diff/fao_fishery_stat_measurements_21917.diff
[CMD] [2020-07-20 11:12:34] echo "." >> /app/public/diff/fao_fishery_stat_measurements_21917.diff
[STOP] [2020-07-20 11:12:34] calculate_delta
[START] [2020-07-20 11:12:34] parse_diff_and_store
[INFO] [2020-07-20 11:12:34] Loading nodes diff file into memory (true lines)...
[INFO] [2020-07-20 11:12:38] Loading occurrences diff file into memory (true lines)...
[INFO] [2020-07-20 11:12:39] Loading measurements diff file into memory (true lines)...
[INFO] [2020-07-20 11:22:44] Storing 13939 ScientificNames
[INFO] [2020-07-20 11:22:44] Processing group of 13939 in 14 groups of 1000
[INFO] [2020-07-20 11:22:49] Average Time: 0.349
[INFO] [2020-07-20 11:22:49] Total Time: 5s
[INFO] [2020-07-20 11:22:49] last 3 / first 3: 1.07
[INFO] [2020-07-20 11:22:49] Std.Dev: 0.03162277660168379; Max: 0.45
[INFO] [2020-07-20 11:22:49] Storing 13939 Nodes
[INFO] [2020-07-20 11:22:49] Processing group of 13939 in 14 groups of 1000
[INFO] [2020-07-20 11:22:53] Average Time: 0.29
[INFO] [2020-07-20 11:22:53] Total Time: 5s
[INFO] [2020-07-20 11:22:53] last 3 / first 3: 1.02
[INFO] [2020-07-20 11:22:53] Std.Dev: 0.03162277660168379; Max: 0.39
[INFO] [2020-07-20 11:22:53] Storing 12418 Occurrences
[INFO] [2020-07-20 11:22:53] Processing group of 12418 in 13 groups of 1000
[INFO] [2020-07-20 11:22:55] Average Time: 0.118
[INFO] [2020-07-20 11:22:55] Total Time: 2s
[INFO] [2020-07-20 11:22:55] last 3 / first 3: 0.82
[INFO] [2020-07-20 11:22:55] Std.Dev: 0.03162277660168379; Max: 0.16
[INFO] [2020-07-20 11:22:55] Storing 151141 Traits
[INFO] [2020-07-20 11:22:55] Processing group of 151141 in 152 groups of 1000
[INFO] [2020-07-20 11:23:41] Average Time: 0.299
[INFO] [2020-07-20 11:23:41] Total Time: 47s
[INFO] [2020-07-20 11:23:41] last 3 / first 3: 0.61
[INFO] [2020-07-20 11:23:41] Std.Dev: 0.2569046515733026; Max: 2.52
[INFO] [2020-07-20 11:23:41] Storing 163107 MetaTraits
[INFO] [2020-07-20 11:23:41] Processing group of 163107 in 164 groups of 1000
[INFO] [2020-07-20 11:23:56] Average Time: 0.09
[INFO] [2020-07-20 11:23:56] Total Time: 16s
[INFO] [2020-07-20 11:23:56] last 3 / first 3: 0.66
[INFO] [2020-07-20 11:23:56] Std.Dev: 0.0; Max: 0.25
[STOP] [2020-07-20 11:23:56] parse_diff_and_store
[START] [2020-07-20 11:23:56] resolve_keys
[INFO] [2020-07-20 11:24:08] Occurrences to nodes (through scientific_names)...
[INFO] [2020-07-20 11:24:09] traits to occurrences...
[INFO] [2020-07-20 11:24:11] traits to nodes (through occurrences)...
[INFO] [2020-07-20 11:24:12] Traits to sex term...
[INFO] [2020-07-20 11:24:12] Traits to lifestage term...
[INFO] [2020-07-20 11:24:12] MetaTraits to traits...
[INFO] [2020-07-20 11:24:23] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2020-07-20 11:24:30] Assocs to occurrences...
[INFO] [2020-07-20 11:24:30] Assocs to nodes...
[INFO] [2020-07-20 11:24:30] Assoc to sex term...
[INFO] [2020-07-20 11:24:30] Assoc to lifestage term...
[STOP] [2020-07-20 11:24:30] resolve_keys
[START] [2020-07-20 11:24:30] hold_for_later_1
[STOP] [2020-07-20 11:24:30] hold_for_later_1
[START] [2020-07-20 11:24:30] hold_for_later_2
[STOP] [2020-07-20 11:24:30] hold_for_later_2
[START] [2020-07-20 11:24:30] resolve_missing_parents
[STOP] [2020-07-20 11:24:32] resolve_missing_parents
[START] [2020-07-20 11:24:32] rebuild_nodes
[START] [2020-07-20 11:24:32] Flattener#flatten
[START] [2020-07-20 11:24:32] Flattener#study_resource
[START] [2020-07-20 11:24:32] Flattener#build_ancestry
[STOP] [2020-07-20 11:24:35] Flattener#build_ancestry
[INFO] [2020-07-20 11:24:35] 13939 ancestry keys
[START] [2020-07-20 11:24:35] build_node_ancestors
[INFO] [2020-07-20 11:24:35] old ancestors deleted.
[STOP] [2020-07-20 11:24:38] build_node_ancestors
[START] [2020-07-20 11:24:40] Flattener#propagate_ancestor_ids
[STOP] [2020-07-20 11:24:42] Flattener#propagate_ancestor_ids
[STOP] [2020-07-20 11:24:42] Flattener#flatten
[STOP] [2020-07-20 11:24:42] rebuild_nodes
[START] [2020-07-20 11:24:42] resolve_missing_media_owners
[STOP] [2020-07-20 11:24:42] resolve_missing_media_owners
[START] [2020-07-20 11:24:42] sanitize_media_verbatims
[STOP] [2020-07-20 11:24:42] sanitize_media_verbatims
[START] [2020-07-20 11:24:42] queue_downloads
[STOP] [2020-07-20 11:24:42] queue_downloads
[START] [2020-07-20 11:24:42] parse_names
[WARN] [2020-07-20 11:24:42] I see 13939 names which still need to be parsed.
[WARN] [2020-07-20 11:24:52] I see 230 names which still need to be parsed.
[WARN] [2020-07-20 11:24:54] I see 8 names which still need to be parsed.
[WARN] [2020-07-20 11:24:55] I see 3 names which still need to be parsed.
[WARN] [2020-07-20 11:24:56] I see 2 names which still need to be parsed.
[WARN] [2020-07-20 11:24:57] I see 1 names which still need to be parsed.
[STOP] [2020-07-20 11:24:58] parse_names
[START] [2020-07-20 11:24:58] denormalize_canonical_names_to_nodes
[STOP] [2020-07-20 11:24:58] denormalize_canonical_names_to_nodes
[START] [2020-07-20 11:24:58] match_nodes
[START] [2020-07-20 11:24:58] map_all_nodes_to_pages
[STOP] [2020-07-20 11:39:49] map_all_nodes_to_pages
[INFO] [2020-07-20 11:39:49] 356 Unmatched nodes (of 13939)! That's too many to output. First 10: Oreochromis andersonii × Oreochromis niloticus (#80605528); Oreochromis aureus × Oreochromis niloticus (#80614012); E. fuscoguttatus × E. lanceolatus (#80608340); Centracanthidae (#80606517); Branchiostegidae (#80606685); Branchiostegidae (#80616653); Centrogeniidae (#80608232); Inermiidae (#80610236); Coiidae (#80610567); Notograptidae (#80612921)
[START] [2020-07-20 11:39:49] update_nodes
[STOP] [2020-07-20 11:39:55] update_nodes
[STOP] [2020-07-20 11:39:55] match_nodes
[START] [2020-07-20 11:39:55] reindex_search
[STOP] [2020-07-20 11:40:13] reindex_search
[START] [2020-07-20 11:40:13] normalize_units
[STOP] [2020-07-20 11:48:57] normalize_units
[START] [2020-07-20 11:48:57] calculate_statistics
[STOP] [2020-07-20 11:48:57] calculate_statistics
[START] [2020-07-20 11:48:57] complete_harvest_instance
[START] [2020-07-20 11:48:57] overall_tsv_creation
[INFO] [2020-07-20 11:48:57] Processing group of 13939 in 2 batches of 10000
[INFO] [2020-07-20 11:50:16] 17861 Traits (unfiltered)...
[INFO] [2020-07-20 11:52:12] 17861 Traits (filtered)...
[INFO] [2020-07-20 11:52:15] 0 Associations (filtered)...
[INFO] [2020-07-20 11:52:22] 90873 metadata added.
[INFO] [2020-07-20 11:52:22] 0 metadata added.
[INFO] [2020-07-20 11:53:27] 8029 Traits (unfiltered)...
[INFO] [2020-07-20 11:54:40] 8029 Traits (filtered)...
[INFO] [2020-07-20 11:54:43] 0 Associations (filtered)...
[INFO] [2020-07-20 11:54:48] 44654 metadata added.
[INFO] [2020-07-20 11:54:48] 0 metadata added.
[INFO] [2020-07-20 11:54:49] Average Time: 145.3
[INFO] [2020-07-20 11:54:49] Total Time: 5m52s
[STOP] [2020-07-20 11:54:49] overall_tsv_creation
[INFO] [2020-07-20 11:54:49] Done. Check your files:
[INFO] [2020-07-20 11:54:49] (13939 lines) /app/public/data/fao_fishery_stat/publish_nodes.tsv
[INFO] [2020-07-20 11:54:49] (50551 lines) /app/public/data/fao_fishery_stat/publish_node_ancestors.tsv
[INFO] [2020-07-20 11:54:49] (13939 lines) /app/public/data/fao_fishery_stat/publish_scientific_names.tsv
[INFO] [2020-07-20 11:54:49] (25891 lines) /app/public/data/fao_fishery_stat/publish_traits.tsv
[INFO] [2020-07-20 11:54:49] (109638 lines) /app/public/data/fao_fishery_stat/publish_metadata.tsv
[STOP] [2020-07-20 11:54:49] complete_harvest_instance
[START] [2020-07-20 11:54:49] completed
[STOP] [2020-07-20 11:54:49] completed
[STOP] [2020-07-20 11:54:49] logged process, took 2543.21
[INFO] [2020-07-21 08:33:23] ## HARVEST: type = re_download_opendata_-harvest
[INFO] [2020-07-21 08:33:23] ## remove_type: ScientificName
[INFO] [2020-07-21 08:33:23] ++ Calling delete_all on 13939 instances...
[INFO] [2020-07-21 08:33:24] [08:33:24.988] Removed 13939 Scientificnames
[INFO] [2020-07-21 08:33:24] ## remove_type: Vernacular
[INFO] [2020-07-21 08:33:24] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-21 08:33:24] [08:33:24.991] Removed 0 Vernaculars
[INFO] [2020-07-21 08:33:24] ## remove_type: Article
[INFO] [2020-07-21 08:33:24] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-21 08:33:24] [08:33:24.994] Removed 0 Articles
[INFO] [2020-07-21 08:33:24] ## remove_type: Medium
[INFO] [2020-07-21 08:33:24] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-21 08:33:24] [08:33:24.997] Removed 0 Media
[INFO] [2020-07-21 08:33:24] ## remove_type: Trait
[INFO] [2020-07-21 08:33:25] ++ Calling delete_all on 151141 instances...
[INFO] [2020-07-21 08:33:59] [08:33:59.457] Removed 151141 Traits
[INFO] [2020-07-21 08:33:59] ## remove_type: MetaTrait
[INFO] [2020-07-21 08:33:59] ++ Calling delete_all on 163107 instances...
[INFO] [2020-07-21 08:34:06] [08:34:06.922] Removed 163107 Metatraits
[INFO] [2020-07-21 08:34:06] ## remove_type: OccurrenceMetadatum
[INFO] [2020-07-21 08:34:06] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-21 08:34:06] [08:34:06.926] Removed 0 Occurrencemetadata
[INFO] [2020-07-21 08:34:06] ## remove_type: Assoc
[INFO] [2020-07-21 08:34:07] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-21 08:34:07] [08:34:07.020] Removed 0 Assocs
[INFO] [2020-07-21 08:34:07] ## remove_type: MetaAssoc
[INFO] [2020-07-21 08:34:07] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-21 08:34:07] [08:34:07.034] Removed 0 Metaassocs
[INFO] [2020-07-21 08:34:07] ## remove_type: Identifier
[INFO] [2020-07-21 08:34:07] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-21 08:34:07] [08:34:07.059] Removed 0 Identifiers
[INFO] [2020-07-21 08:34:07] ## remove_type: Reference
[INFO] [2020-07-21 08:34:07] ++ Calling delete_all on 0 instances...
[INFO] [2020-07-21 08:34:07] [08:34:07.062] Removed 0 References
[INFO] [2020-07-21 08:34:07] Starting batch with ID 80605014...
[INFO] [2020-07-21 08:34:08] Starting batch with ID 80616201...
[INFO] [2020-07-21 08:34:09] Starting batch with ID 80616201...
[INFO] [2020-07-21 08:34:10] Starting batch with ID 80612369...
[INFO] [2020-07-21 08:34:11] Starting batch with ID 80612369...
[INFO] [2020-07-21 08:34:11] ## remove_type: Node
[INFO] [2020-07-21 08:34:11] ++ Calling delete_all on 13939 instances...
[INFO] [2020-07-21 08:34:13] [08:34:13.018] Removed 13939 Nodes
[START] [2020-07-21 08:34:16] logged process
[START] [2020-07-21 08:34:16] Creating resource from OpenData
[START] [2020-07-21 08:34:16] logged process
[START] [2020-07-21 08:34:16] Parse meta.xml file and create formats with fields
[STOP] [2020-07-21 08:34:17] Parse meta.xml file and create formats with fields
[STOP] [2020-07-21 08:34:17] Creating resource from OpenData
[START] [2020-07-21 08:34:17] logged process
[START] [2020-07-21 08:34:17] create_harvest_instance
[STOP] [2020-07-21 08:34:18] create_harvest_instance
[START] [2020-07-21 08:34:18] fetch_files
[STOP] [2020-07-21 08:34:18] fetch_files
[START] [2020-07-21 08:34:18] validate_each_file
[STOP] [2020-07-21 08:34:24] validate_each_file
[START] [2020-07-21 08:34:24] convert_to_csv
[CMD] [2020-07-21 08:34:24] /usr/bin/sort /app/public/converted_csv/fao_fishery_stat_nodes_21921.csv > /app/public/converted_csv/fao_fishery_stat_nodes_21921.csv_sorted
[CMD] [2020-07-21 08:34:24] /usr/bin/sort /app/public/converted_csv/fao_fishery_stat_occurrences_21922.csv > /app/public/converted_csv/fao_fishery_stat_occurrences_21922.csv_sorted
[CMD] [2020-07-21 08:34:24] /usr/bin/sort /app/public/converted_csv/fao_fishery_stat_measurements_21923.csv > /app/public/converted_csv/fao_fishery_stat_measurements_21923.csv_sorted
[STOP] [2020-07-21 08:34:24] convert_to_csv
[START] [2020-07-21 08:34:24] calculate_delta
[CMD] [2020-07-21 08:34:24] echo "0a" > /app/public/diff/fao_fishery_stat_nodes_21921.diff
[CMD] [2020-07-21 08:34:24] tail -n +1 /app/public/converted_csv/fao_fishery_stat_nodes_21921.csv >> /app/public/diff/fao_fishery_stat_nodes_21921.diff
[CMD] [2020-07-21 08:34:24] echo "." >> /app/public/diff/fao_fishery_stat_nodes_21921.diff
[CMD] [2020-07-21 08:34:24] echo "0a" > /app/public/diff/fao_fishery_stat_occurrences_21922.diff
[CMD] [2020-07-21 08:34:24] tail -n +1 /app/public/converted_csv/fao_fishery_stat_occurrences_21922.csv >> /app/public/diff/fao_fishery_stat_occurrences_21922.diff
[CMD] [2020-07-21 08:34:24] echo "." >> /app/public/diff/fao_fishery_stat_occurrences_21922.diff
[CMD] [2020-07-21 08:34:24] echo "0a" > /app/public/diff/fao_fishery_stat_measurements_21923.diff
[CMD] [2020-07-21 08:34:24] tail -n +1 /app/public/converted_csv/fao_fishery_stat_measurements_21923.csv >> /app/public/diff/fao_fishery_stat_measurements_21923.diff
[CMD] [2020-07-21 08:34:24] echo "." >> /app/public/diff/fao_fishery_stat_measurements_21923.diff
[STOP] [2020-07-21 08:34:24] calculate_delta
[START] [2020-07-21 08:34:24] parse_diff_and_store
[INFO] [2020-07-21 08:34:24] Loading nodes diff file into memory (true lines)...
[INFO] [2020-07-21 08:34:28] Loading occurrences diff file into memory (true lines)...
[INFO] [2020-07-21 08:34:29] Loading measurements diff file into memory (true lines)...
[INFO] [2020-07-21 08:43:59] Storing 13939 ScientificNames
[INFO] [2020-07-21 08:43:59] Processing group of 13939 in 14 groups of 1000
[INFO] [2020-07-21 08:44:06] Average Time: 0.486
[INFO] [2020-07-21 08:44:06] Total Time: 7s
[INFO] [2020-07-21 08:44:06] last 3 / first 3: 0.98
[INFO] [2020-07-21 08:44:06] Std.Dev: 0.5403702434442518; Max: 2.36
[INFO] [2020-07-21 08:44:06] Storing 13939 Nodes
[INFO] [2020-07-21 08:44:06] Processing group of 13939 in 14 groups of 1000
[INFO] [2020-07-21 08:44:10] Average Time: 0.28
[INFO] [2020-07-21 08:44:10] Total Time: 4s
[INFO] [2020-07-21 08:44:10] last 3 / first 3: 1.05
[INFO] [2020-07-21 08:44:10] Std.Dev: 0.03162277660168379; Max: 0.38
[INFO] [2020-07-21 08:44:10] Storing 12418 Occurrences
[INFO] [2020-07-21 08:44:10] Processing group of 12418 in 13 groups of 1000
[INFO] [2020-07-21 08:44:11] Average Time: 0.118
[INFO] [2020-07-21 08:44:11] Total Time: 2s
[INFO] [2020-07-21 08:44:11] last 3 / first 3: 1.09
[INFO] [2020-07-21 08:44:11] Std.Dev: 0.044721359549995794; Max: 0.21
[INFO] [2020-07-21 08:44:11] Storing 140818 Traits
[INFO] [2020-07-21 08:44:11] Processing group of 140818 in 141 groups of 1000
[INFO] [2020-07-21 08:44:49] Average Time: 0.262
[INFO] [2020-07-21 08:44:49] Total Time: 38s
[INFO] [2020-07-21 08:44:49] last 3 / first 3: 0.9
[INFO] [2020-07-21 08:44:49] Std.Dev: 0.15491933384829668; Max: 2.04
[INFO] [2020-07-21 08:44:49] Storing 152784 MetaTraits
[INFO] [2020-07-21 08:44:49] Processing group of 152784 in 153 groups of 1000
[INFO] [2020-07-21 08:45:06] Average Time: 0.107
[INFO] [2020-07-21 08:45:06] Total Time: 17s
[INFO] [2020-07-21 08:45:06] last 3 / first 3: 0.89
[INFO] [2020-07-21 08:45:06] Std.Dev: 0.1224744871391589; Max: 1.59
[STOP] [2020-07-21 08:45:06] parse_diff_and_store
[START] [2020-07-21 08:45:06] resolve_keys
[INFO] [2020-07-21 08:45:19] Occurrences to nodes (through scientific_names)...
[INFO] [2020-07-21 08:45:20] traits to occurrences...
[INFO] [2020-07-21 08:45:21] traits to nodes (through occurrences)...
[INFO] [2020-07-21 08:45:21] Traits to sex term...
[INFO] [2020-07-21 08:45:22] Traits to lifestage term...
[INFO] [2020-07-21 08:45:22] MetaTraits to traits...
[INFO] [2020-07-21 08:45:32] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2020-07-21 08:45:39] Assocs to occurrences...
[INFO] [2020-07-21 08:45:39] Assocs to nodes...
[INFO] [2020-07-21 08:45:39] Assoc to sex term...
[INFO] [2020-07-21 08:45:39] Assoc to lifestage term...
[STOP] [2020-07-21 08:45:39] resolve_keys
[START] [2020-07-21 08:45:39] hold_for_later_1
[STOP] [2020-07-21 08:45:39] hold_for_later_1
[START] [2020-07-21 08:45:39] hold_for_later_2
[STOP] [2020-07-21 08:45:39] hold_for_later_2
[START] [2020-07-21 08:45:39] resolve_missing_parents
[STOP] [2020-07-21 08:45:41] resolve_missing_parents
[START] [2020-07-21 08:45:41] rebuild_nodes
[START] [2020-07-21 08:45:41] Flattener#flatten
[START] [2020-07-21 08:45:41] Flattener#study_resource
[START] [2020-07-21 08:45:41] Flattener#build_ancestry
[STOP] [2020-07-21 08:45:41] Flattener#build_ancestry
[INFO] [2020-07-21 08:45:41] 13939 ancestry keys
[START] [2020-07-21 08:45:41] build_node_ancestors
[INFO] [2020-07-21 08:45:41] old ancestors deleted.
[STOP] [2020-07-21 08:45:43] build_node_ancestors
[START] [2020-07-21 08:45:47] Flattener#propagate_ancestor_ids
[STOP] [2020-07-21 08:45:48] Flattener#propagate_ancestor_ids
[STOP] [2020-07-21 08:45:48] Flattener#flatten
[STOP] [2020-07-21 08:45:48] rebuild_nodes
[START] [2020-07-21 08:45:48] resolve_missing_media_owners
[STOP] [2020-07-21 08:45:48] resolve_missing_media_owners
[START] [2020-07-21 08:45:48] sanitize_media_verbatims
[STOP] [2020-07-21 08:45:48] sanitize_media_verbatims
[START] [2020-07-21 08:45:48] queue_downloads
[STOP] [2020-07-21 08:45:48] queue_downloads
[START] [2020-07-21 08:45:48] parse_names
[WARN] [2020-07-21 08:45:48] I see 13939 names which still need to be parsed.
[WARN] [2020-07-21 08:45:58] I see 230 names which still need to be parsed.
[WARN] [2020-07-21 08:46:02] I see 8 names which still need to be parsed.
[WARN] [2020-07-21 08:46:03] I see 3 names which still need to be parsed.
[WARN] [2020-07-21 08:46:04] I see 2 names which still need to be parsed.
[WARN] [2020-07-21 08:46:05] I see 1 names which still need to be parsed.
[STOP] [2020-07-21 08:46:07] parse_names
[START] [2020-07-21 08:46:07] denormalize_canonical_names_to_nodes
[STOP] [2020-07-21 08:46:07] denormalize_canonical_names_to_nodes
[START] [2020-07-21 08:46:07] match_nodes
[START] [2020-07-21 08:46:07] map_all_nodes_to_pages
[STOP] [2020-07-21 09:02:05] map_all_nodes_to_pages
[INFO] [2020-07-21 09:02:05] 356 Unmatched nodes (of 13939)! That's too many to output. First 10: Oreochromis andersonii × Oreochromis niloticus (#80619467); Oreochromis aureus × Oreochromis niloticus (#80627951); E. fuscoguttatus × E. lanceolatus (#80622279); Centracanthidae (#80620456); Branchiostegidae (#80620624); Branchiostegidae (#80630592); Centrogeniidae (#80622171); Inermiidae (#80624175); Coiidae (#80624506); Notograptidae (#80626860)
[START] [2020-07-21 09:02:05] update_nodes
[STOP] [2020-07-21 09:02:10] update_nodes
[STOP] [2020-07-21 09:02:10] match_nodes
[START] [2020-07-21 09:02:10] reindex_search
[STOP] [2020-07-21 09:02:30] reindex_search
[START] [2020-07-21 09:02:30] normalize_units
[STOP] [2020-07-21 09:11:12] normalize_units
[START] [2020-07-21 09:11:12] calculate_statistics
[STOP] [2020-07-21 09:11:12] calculate_statistics
[START] [2020-07-21 09:11:12] complete_harvest_instance
[START] [2020-07-21 09:11:12] overall_tsv_creation
[INFO] [2020-07-21 09:11:13] Processing group of 13939 in 2 batches of 10000
[INFO] [2020-07-21 09:13:14] 10549 Traits (unfiltered)...
[INFO] [2020-07-21 09:14:40] 10549 Traits (filtered)...
[INFO] [2020-07-21 09:14:44] 0 Associations (filtered)...
[INFO] [2020-07-21 09:14:49] 83561 metadata added.
[INFO] [2020-07-21 09:14:49] 0 metadata added.
[INFO] [2020-07-21 09:15:56] 5018 Traits (unfiltered)...
[INFO] [2020-07-21 09:16:56] 5018 Traits (filtered)...
[INFO] [2020-07-21 09:16:59] 0 Associations (filtered)...
[INFO] [2020-07-21 09:17:03] 41643 metadata added.
[INFO] [2020-07-21 09:17:03] 0 metadata added.
[INFO] [2020-07-21 09:17:03] Average Time: 122.225
[INFO] [2020-07-21 09:17:03] Total Time: 5m51s
[STOP] [2020-07-21 09:17:03] overall_tsv_creation
[INFO] [2020-07-21 09:17:03] Done. Check your files:
[INFO] [2020-07-21 09:17:03] (13939 lines) /app/public/data/fao_fishery_stat/publish_nodes.tsv
[INFO] [2020-07-21 09:17:03] (50551 lines) /app/public/data/fao_fishery_stat/publish_node_ancestors.tsv
[INFO] [2020-07-21 09:17:03] (13939 lines) /app/public/data/fao_fishery_stat/publish_scientific_names.tsv
[INFO] [2020-07-21 09:17:03] (15568 lines) /app/public/data/fao_fishery_stat/publish_traits.tsv
[INFO] [2020-07-21 09:17:03] (109638 lines) /app/public/data/fao_fishery_stat/publish_metadata.tsv
[STOP] [2020-07-21 09:17:04] complete_harvest_instance
[START] [2020-07-21 09:17:04] completed
[STOP] [2020-07-21 09:17:04] completed
[STOP] [2020-07-21 09:17:04] logged process, took 2567.18

Latest Process