Harvest for Bismarck Sea Species List
Created: 22 Dec 22:32

Stage: completed
Fetched: 22 Dec 22:32
Validated: 22 Dec 22:32
Deltas Created: 22 Dec 22:32
Units Normalized: 22 Dec 23:01
Ancestry Built: 22 Dec 22:34
Nodes Matched: 22 Dec 23:01
Names Parsed: 22 Dec 22:35
New Models Stored: 22 Dec 22:33
Indexed: 22 Dec 23:01
Completed: 22 Dec 23:04
Time to Harvest: 32 minutes

Harvesting Log (most recent first)

# Logfile created on 2019-12-22 22:32:39 -0500 by logger.rb/56815
[START] [2019-12-22 22:32:39] logged process
[START] [2019-12-22 22:32:39] create_harvest_instance
[STOP] [2019-12-22 22:32:39] create_harvest_instance
[START] [2019-12-22 22:32:39] fetch_files
[STOP] [2019-12-22 22:32:39] fetch_files
[START] [2019-12-22 22:32:39] validate_each_file
[STOP] [2019-12-22 22:32:40] validate_each_file
[START] [2019-12-22 22:32:40] convert_to_csv
[CMD] [2019-12-22 22:32:40] /usr/bin/sort /app/public/converted_csv/bismarck_sea_sp__refs_19130.csv > /app/public/converted_csv/bismarck_sea_sp__refs_19130.csv_sorted
[CMD] [2019-12-22 22:32:41] /usr/bin/sort /app/public/converted_csv/bismarck_sea_sp__nodes_19131.csv > /app/public/converted_csv/bismarck_sea_sp__nodes_19131.csv_sorted
[CMD] [2019-12-22 22:32:42] /usr/bin/sort /app/public/converted_csv/bismarck_sea_sp__occurrences_19132.csv > /app/public/converted_csv/bismarck_sea_sp__occurrences_19132.csv_sorted
[CMD] [2019-12-22 22:32:42] /usr/bin/sort /app/public/converted_csv/bismarck_sea_sp__measurements_19133.csv > /app/public/converted_csv/bismarck_sea_sp__measurements_19133.csv_sorted
[STOP] [2019-12-22 22:32:43] convert_to_csv
[START] [2019-12-22 22:32:43] calculate_delta
[CMD] [2019-12-22 22:32:43] echo "0a" > /app/public/diff/bismarck_sea_sp__refs_19130.diff
[CMD] [2019-12-22 22:32:44] tail -n +1 /app/public/converted_csv/bismarck_sea_sp__refs_19130.csv >> /app/public/diff/bismarck_sea_sp__refs_19130.diff
[CMD] [2019-12-22 22:32:45] echo "." >> /app/public/diff/bismarck_sea_sp__refs_19130.diff
[CMD] [2019-12-22 22:32:45] echo "0a" > /app/public/diff/bismarck_sea_sp__nodes_19131.diff
[CMD] [2019-12-22 22:32:46] tail -n +1 /app/public/converted_csv/bismarck_sea_sp__nodes_19131.csv >> /app/public/diff/bismarck_sea_sp__nodes_19131.diff
[CMD] [2019-12-22 22:32:46] echo "." >> /app/public/diff/bismarck_sea_sp__nodes_19131.diff
[CMD] [2019-12-22 22:32:47] echo "0a" > /app/public/diff/bismarck_sea_sp__occurrences_19132.diff
[CMD] [2019-12-22 22:32:48] tail -n +1 /app/public/converted_csv/bismarck_sea_sp__occurrences_19132.csv >> /app/public/diff/bismarck_sea_sp__occurrences_19132.diff
[CMD] [2019-12-22 22:32:49] echo "." >> /app/public/diff/bismarck_sea_sp__occurrences_19132.diff
[CMD] [2019-12-22 22:32:49] echo "0a" > /app/public/diff/bismarck_sea_sp__measurements_19133.diff
[CMD] [2019-12-22 22:32:50] tail -n +1 /app/public/converted_csv/bismarck_sea_sp__measurements_19133.csv >> /app/public/diff/bismarck_sea_sp__measurements_19133.diff
[CMD] [2019-12-22 22:32:51] echo "." >> /app/public/diff/bismarck_sea_sp__measurements_19133.diff
[STOP] [2019-12-22 22:32:51] calculate_delta
[START] [2019-12-22 22:32:51] parse_diff_and_store
[INFO] [2019-12-22 22:32:52] Loading refs diff file into memory (true lines)...
[INFO] [2019-12-22 22:32:53] Loading nodes diff file into memory (true lines)...
[INFO] [2019-12-22 22:32:56] Loading occurrences diff file into memory (true lines)...
[INFO] [2019-12-22 22:32:58] Loading measurements diff file into memory (true lines)...
[INFO] [2019-12-22 22:33:28] Storing 2 References
[INFO] [2019-12-22 22:33:28] Processing group of 2 in 1 groups of 1000
[INFO] [2019-12-22 22:33:28] Average Time: 0.0
[INFO] [2019-12-22 22:33:28] Total Time: 1s
[INFO] [2019-12-22 22:33:28] Storing 8603 ScientificNames
[INFO] [2019-12-22 22:33:28] Processing group of 8603 in 9 groups of 1000
[INFO] [2019-12-22 22:33:32] Average Time: 0.427
[INFO] [2019-12-22 22:33:32] Total Time: 4s
[INFO] [2019-12-22 22:33:32] last 3 / first 3: 1.03
[INFO] [2019-12-22 22:33:32] Std.Dev: 0.1341640786499874; Max: 0.71
[INFO] [2019-12-22 22:33:32] Storing 8603 Nodes
[INFO] [2019-12-22 22:33:32] Processing group of 8603 in 9 groups of 1000
[INFO] [2019-12-22 22:33:35] Average Time: 0.318
[INFO] [2019-12-22 22:33:35] Total Time: 3s
[INFO] [2019-12-22 22:33:35] last 3 / first 3: 0.96
[INFO] [2019-12-22 22:33:35] Std.Dev: 0.03162277660168379; Max: 0.35
[INFO] [2019-12-22 22:33:35] Storing 5220 Occurrences
[INFO] [2019-12-22 22:33:35] Processing group of 5220 in 6 groups of 1000
[INFO] [2019-12-22 22:33:35] Average Time: 0.108
[INFO] [2019-12-22 22:33:35] Total Time: 1s
[INFO] [2019-12-22 22:33:35] Storing 10440 TraitsReferences
[INFO] [2019-12-22 22:33:35] Processing group of 10440 in 11 groups of 1000
[INFO] [2019-12-22 22:33:36] Average Time: 0.079
[INFO] [2019-12-22 22:33:36] Total Time: 1s
[INFO] [2019-12-22 22:33:36] last 3 / first 3: 0.63
[INFO] [2019-12-22 22:33:36] Std.Dev: 0.03162277660168379; Max: 0.15
[INFO] [2019-12-22 22:33:36] Storing 10440 Traits
[INFO] [2019-12-22 22:33:36] Processing group of 10440 in 11 groups of 1000
[INFO] [2019-12-22 22:33:41] Average Time: 0.396
[INFO] [2019-12-22 22:33:41] Total Time: 5s
[INFO] [2019-12-22 22:33:41] last 3 / first 3: 0.67
[INFO] [2019-12-22 22:33:41] Std.Dev: 0.12649110640673517; Max: 0.67
[INFO] [2019-12-22 22:33:41] Storing 10437 MetaTraits
[INFO] [2019-12-22 22:33:41] Processing group of 10437 in 11 groups of 1000
[INFO] [2019-12-22 22:33:42] Average Time: 0.15
[INFO] [2019-12-22 22:33:42] Total Time: 2s
[INFO] [2019-12-22 22:33:42] last 3 / first 3: 0.77
[INFO] [2019-12-22 22:33:42] Std.Dev: 0.03162277660168379; Max: 0.2
[STOP] [2019-12-22 22:33:42] parse_diff_and_store
[START] [2019-12-22 22:33:42] resolve_keys
[INFO] [2019-12-22 22:34:16] Occurrences to nodes (through scientific_names)...
[INFO] [2019-12-22 22:34:19] traits to occurrences...
[INFO] [2019-12-22 22:34:21] traits to nodes (through occurrences)...
[INFO] [2019-12-22 22:34:21] Traits to sex term...
[INFO] [2019-12-22 22:34:23] Traits to lifestage term...
[INFO] [2019-12-22 22:34:25] MetaTraits to traits...
[INFO] [2019-12-22 22:34:26] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-12-22 22:34:27] Assocs to occurrences...
[INFO] [2019-12-22 22:34:27] Assocs to nodes...
[INFO] [2019-12-22 22:34:27] Assoc to sex term...
[INFO] [2019-12-22 22:34:27] Assoc to lifestage term...
[STOP] [2019-12-22 22:34:27] resolve_keys
[START] [2019-12-22 22:34:27] hold_for_later_1
[STOP] [2019-12-22 22:34:27] hold_for_later_1
[START] [2019-12-22 22:34:27] hold_for_later_2
[STOP] [2019-12-22 22:34:27] hold_for_later_2
[START] [2019-12-22 22:34:27] resolve_missing_parents
[STOP] [2019-12-22 22:34:47] resolve_missing_parents
[START] [2019-12-22 22:34:47] rebuild_nodes
[START] [2019-12-22 22:34:47] Flattener#flatten
[START] [2019-12-22 22:34:47] Flattener#study_resource
[START] [2019-12-22 22:34:47] Flattener#build_ancestry
[STOP] [2019-12-22 22:34:47] Flattener#build_ancestry
[INFO] [2019-12-22 22:34:47] 8603 ancestry keys
[START] [2019-12-22 22:34:47] build_node_ancestors
[INFO] [2019-12-22 22:34:47] old ancestors deleted.
[STOP] [2019-12-22 22:34:51] build_node_ancestors
[START] [2019-12-22 22:34:54] Flattener#propagate_ancestor_ids
[STOP] [2019-12-22 22:34:55] Flattener#propagate_ancestor_ids
[STOP] [2019-12-22 22:34:55] Flattener#flatten
[STOP] [2019-12-22 22:34:55] rebuild_nodes
[START] [2019-12-22 22:34:55] resolve_missing_media_owners
[STOP] [2019-12-22 22:34:55] resolve_missing_media_owners
[START] [2019-12-22 22:34:55] sanitize_media_verbatims
[STOP] [2019-12-22 22:34:55] sanitize_media_verbatims
[START] [2019-12-22 22:34:55] queue_downloads
[STOP] [2019-12-22 22:34:55] queue_downloads
[START] [2019-12-22 22:34:55] parse_names
[WARN] [2019-12-22 22:34:55] I see 8603 names which still need to be parsed.
[STOP] [2019-12-22 22:35:04] parse_names
[START] [2019-12-22 22:35:04] denormalize_canonical_names_to_nodes
[STOP] [2019-12-22 22:35:04] denormalize_canonical_names_to_nodes
[START] [2019-12-22 22:35:04] match_nodes
[START] [2019-12-22 22:35:04] map_all_nodes_to_pages
[STOP] [2019-12-22 23:01:15] map_all_nodes_to_pages
[INFO] [2019-12-22 23:01:15] 478 Unmatched nodes (of 8603)! That's too many to output. First 10: Apogon striatus (#61776602); Fowleria auritus (#61775798); Giuris aporos (#61773087); Scarus forsteri (#61775746); Ptereleotridae (#61770800); Trachinocephalus (#61769477); Strophidon macrura (#61776077); Sargocentron cornutus (#61775477); Scorpaeniformes (#61768348); Sunagocia malayanus (#61775023)
[START] [2019-12-22 23:01:15] update_nodes
[STOP] [2019-12-22 23:01:19] update_nodes
[STOP] [2019-12-22 23:01:19] match_nodes
[START] [2019-12-22 23:01:19] reindex_search
[STOP] [2019-12-22 23:01:41] reindex_search
[START] [2019-12-22 23:01:41] normalize_units
[STOP] [2019-12-22 23:01:41] normalize_units
[START] [2019-12-22 23:01:41] calculate_statistics
[STOP] [2019-12-22 23:01:41] calculate_statistics
[START] [2019-12-22 23:01:41] complete_harvest_instance
[START] [2019-12-22 23:01:41] overall_tsv_creation
[INFO] [2019-12-22 23:01:41] Processing group of 8603 in 1 batches of 10000
[INFO] [2019-12-22 23:03:08] 5220 Traits (unfiltered)...
[INFO] [2019-12-22 23:03:21] 5220 Traits (filtered)...
[INFO] [2019-12-22 23:03:21] 0 Associations (filtered)...
[INFO] [2019-12-22 23:04:10] 26097 metadata added.
[INFO] [2019-12-22 23:04:10] 0 metadata added.
[INFO] [2019-12-22 23:04:10] Average Time: 119.73
[INFO] [2019-12-22 23:04:10] Total Time: 2m29s
[STOP] [2019-12-22 23:04:10] overall_tsv_creation
[INFO] [2019-12-22 23:04:10] Done. Check your files:
[INFO] [2019-12-22 23:04:10] (8603 lines) /app/public/data/bismarck_sea_sp_/publish_nodes.tsv
[INFO] [2019-12-22 23:04:11] (46062 lines) /app/public/data/bismarck_sea_sp_/publish_node_ancestors.tsv
[INFO] [2019-12-22 23:04:12] (8603 lines) /app/public/data/bismarck_sea_sp_/publish_scientific_names.tsv
[INFO] [2019-12-22 23:04:12] (5221 lines) /app/public/data/bismarck_sea_sp_/publish_traits.tsv
[INFO] [2019-12-22 23:04:13] (26098 lines) /app/public/data/bismarck_sea_sp_/publish_metadata.tsv
[STOP] [2019-12-22 23:04:13] complete_harvest_instance
[START] [2019-12-22 23:04:13] completed
[STOP] [2019-12-22 23:04:13] completed
[STOP] [2019-12-22 23:04:13] logged process, took 1894.76
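
The START/STOP pairs in the log above can be turned into per-step durations, which makes it easier to see where the harvest time goes. The following is a minimal sketch, not part of the harvester: the function name `step_durations` and the regular expression are assumptions based only on the line format visible in this log, and the sample lines are copied from it.

```python
import re
from datetime import datetime

# Matches lines such as "[START] [2019-12-22 22:32:39] fetch_files".
# The pattern is inferred from this log only (an assumption).
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)

def step_durations(lines):
    """Return [(step_name, whole_seconds)] for each START/STOP pair, in STOP order."""
    open_steps = {}  # step name -> start time of a step that has not stopped yet
    durations = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # INFO, WARN, CMD and comment lines mark no step boundary
        kind, stamp, name = m.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            open_steps[name] = when
        else:
            # Drop trailing text such as ", took 1894.76" on the final STOP line.
            name = name.split(",")[0]
            if name in open_steps:
                start = open_steps.pop(name)
                durations.append((name, int((when - start).total_seconds())))
    return durations

sample = [
    "[START] [2019-12-22 22:32:39] logged process",
    "[START] [2019-12-22 22:35:04] match_nodes",
    "[START] [2019-12-22 22:35:04] map_all_nodes_to_pages",
    "[STOP] [2019-12-22 23:01:15] map_all_nodes_to_pages",
    "[START] [2019-12-22 23:01:15] update_nodes",
    "[STOP] [2019-12-22 23:01:19] update_nodes",
    "[STOP] [2019-12-22 23:01:19] match_nodes",
    "[STOP] [2019-12-22 23:04:13] logged process, took 1894.76",
]

for name, seconds in step_durations(sample):
    print(f"{name}: {seconds}s")
```

On the sample lines, `map_all_nodes_to_pages` accounts for 1571 of the 1894 seconds between the first START and the last STOP, which agrees with the logged total of 1894.76 if that figure is read as seconds.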
