Harvest for Southern Ocean Species List

Created: 25 Dec 14:43
Stage: completed
Fetched: 25 Dec 14:43
Validated: 25 Dec 14:43
Deltas Created: 25 Dec 14:44
Units Normalized: 25 Dec 15:45
Ancestry Built: 25 Dec 14:48
Nodes Matched: 25 Dec 15:44
Names Parsed: 25 Dec 14:48
New Models Stored: 25 Dec 14:45
Indexed: 25 Dec 15:45
Completed: 25 Dec 15:50
Time to Harvest: 1 hour 6 minutes

Harvesting Log

(156 lines)
# Logfile created on 2019-12-25 14:43:57 -0500 by logger.rb/56815
[START] [2019-12-25 14:43:57] logged process
[START] [2019-12-25 14:43:57] create_harvest_instance
[STOP] [2019-12-25 14:43:57] create_harvest_instance
[START] [2019-12-25 14:43:57] fetch_files
[STOP] [2019-12-25 14:43:57] fetch_files
[START] [2019-12-25 14:43:57] validate_each_file
[STOP] [2019-12-25 14:43:59] validate_each_file
[START] [2019-12-25 14:43:59] convert_to_csv
[CMD] [2019-12-25 14:43:59] /usr/bin/sort /app/public/converted_csv/Southern_Ocean_S_refs_19690.csv > /app/public/converted_csv/Southern_Ocean_S_refs_19690.csv_sorted
[CMD] [2019-12-25 14:44:00] /usr/bin/sort /app/public/converted_csv/Southern_Ocean_S_nodes_19691.csv > /app/public/converted_csv/Southern_Ocean_S_nodes_19691.csv_sorted
[CMD] [2019-12-25 14:44:01] /usr/bin/sort /app/public/converted_csv/Southern_Ocean_S_occurrences_19692.csv > /app/public/converted_csv/Southern_Ocean_S_occurrences_19692.csv_sorted
[CMD] [2019-12-25 14:44:01] /usr/bin/sort /app/public/converted_csv/Southern_Ocean_S_measurements_19693.csv > /app/public/converted_csv/Southern_Ocean_S_measurements_19693.csv_sorted
[STOP] [2019-12-25 14:44:02] convert_to_csv
[START] [2019-12-25 14:44:02] calculate_delta
[CMD] [2019-12-25 14:44:02] echo "0a" > /app/public/diff/Southern_Ocean_S_refs_19690.diff
[CMD] [2019-12-25 14:44:03] tail -n +1 /app/public/converted_csv/Southern_Ocean_S_refs_19690.csv >> /app/public/diff/Southern_Ocean_S_refs_19690.diff
[CMD] [2019-12-25 14:44:03] echo "." >> /app/public/diff/Southern_Ocean_S_refs_19690.diff
[CMD] [2019-12-25 14:44:04] echo "0a" > /app/public/diff/Southern_Ocean_S_nodes_19691.diff
[CMD] [2019-12-25 14:44:05] tail -n +1 /app/public/converted_csv/Southern_Ocean_S_nodes_19691.csv >> /app/public/diff/Southern_Ocean_S_nodes_19691.diff
[CMD] [2019-12-25 14:44:05] echo "." >> /app/public/diff/Southern_Ocean_S_nodes_19691.diff
[CMD] [2019-12-25 14:44:06] echo "0a" > /app/public/diff/Southern_Ocean_S_occurrences_19692.diff
[CMD] [2019-12-25 14:44:06] tail -n +1 /app/public/converted_csv/Southern_Ocean_S_occurrences_19692.csv >> /app/public/diff/Southern_Ocean_S_occurrences_19692.diff
[CMD] [2019-12-25 14:44:07] echo "." >> /app/public/diff/Southern_Ocean_S_occurrences_19692.diff
[CMD] [2019-12-25 14:44:08] echo "0a" > /app/public/diff/Southern_Ocean_S_measurements_19693.diff
[CMD] [2019-12-25 14:44:08] tail -n +1 /app/public/converted_csv/Southern_Ocean_S_measurements_19693.csv >> /app/public/diff/Southern_Ocean_S_measurements_19693.diff
[CMD] [2019-12-25 14:44:09] echo "." >> /app/public/diff/Southern_Ocean_S_measurements_19693.diff
[STOP] [2019-12-25 14:44:10] calculate_delta
[START] [2019-12-25 14:44:10] parse_diff_and_store
[INFO] [2019-12-25 14:44:10] Loading refs diff file into memory (true lines)...
[INFO] [2019-12-25 14:44:11] Loading nodes diff file into memory (true lines)...
[INFO] [2019-12-25 14:44:17] Loading occurrences diff file into memory (true lines)...
[INFO] [2019-12-25 14:44:19] Loading measurements diff file into memory (true lines)...
[INFO] [2019-12-25 14:45:11] Storing 2 References
[INFO] [2019-12-25 14:45:11] Processing group of 2 in 1 groups of 1000
[INFO] [2019-12-25 14:45:11] Average Time: 0.0
[INFO] [2019-12-25 14:45:11] Total Time: 1s
[INFO] [2019-12-25 14:45:11] Storing 16092 ScientificNames
[INFO] [2019-12-25 14:45:11] Processing group of 16092 in 17 groups of 1000
[INFO] [2019-12-25 14:45:17] Average Time: 0.338
[INFO] [2019-12-25 14:45:17] Total Time: 6s
[INFO] [2019-12-25 14:45:17] last 3 / first 3: 0.71
[INFO] [2019-12-25 14:45:17] Std.Dev: 0.08366600265340755; Max: 0.44
[INFO] [2019-12-25 14:45:17] Storing 16092 Nodes
[INFO] [2019-12-25 14:45:17] Processing group of 16092 in 17 groups of 1000
[INFO] [2019-12-25 14:45:23] Average Time: 0.307
[INFO] [2019-12-25 14:45:23] Total Time: 6s
[INFO] [2019-12-25 14:45:23] last 3 / first 3: 0.86
[INFO] [2019-12-25 14:45:23] Std.Dev: 0.07745966692414834; Max: 0.4
[INFO] [2019-12-25 14:45:23] Storing 9706 Occurrences
[INFO] [2019-12-25 14:45:23] Processing group of 9706 in 10 groups of 1000
[INFO] [2019-12-25 14:45:24] Average Time: 0.155
[INFO] [2019-12-25 14:45:24] Total Time: 2s
[INFO] [2019-12-25 14:45:24] last 3 / first 3: 0.35
[INFO] [2019-12-25 14:45:24] Std.Dev: 0.17029386365926402; Max: 0.64
[INFO] [2019-12-25 14:45:24] Storing 19412 TraitsReferences
[INFO] [2019-12-25 14:45:24] Processing group of 19412 in 20 groups of 1000
[INFO] [2019-12-25 14:45:26] Average Time: 0.072
[INFO] [2019-12-25 14:45:26] Total Time: 2s
[INFO] [2019-12-25 14:45:26] last 3 / first 3: 0.55
[INFO] [2019-12-25 14:45:26] Std.Dev: 0.0; Max: 0.13
[INFO] [2019-12-25 14:45:26] Storing 19412 Traits
[INFO] [2019-12-25 14:45:26] Processing group of 19412 in 20 groups of 1000
[INFO] [2019-12-25 14:45:32] Average Time: 0.329
[INFO] [2019-12-25 14:45:32] Total Time: 7s
[INFO] [2019-12-25 14:45:32] last 3 / first 3: 0.91
[INFO] [2019-12-25 14:45:32] Std.Dev: 0.06324555320336758; Max: 0.41
[INFO] [2019-12-25 14:45:32] Storing 19381 MetaTraits
[INFO] [2019-12-25 14:45:32] Processing group of 19381 in 20 groups of 1000
[INFO] [2019-12-25 14:45:35] Average Time: 0.133
[INFO] [2019-12-25 14:45:35] Total Time: 3s
[INFO] [2019-12-25 14:45:35] last 3 / first 3: 0.66
[INFO] [2019-12-25 14:45:35] Std.Dev: 0.03162277660168379; Max: 0.18
[STOP] [2019-12-25 14:45:35] parse_diff_and_store
[START] [2019-12-25 14:45:35] resolve_keys
[INFO] [2019-12-25 14:47:06] Occurrences to nodes (through scientific_names)...
[INFO] [2019-12-25 14:47:13] traits to occurrences...
[INFO] [2019-12-25 14:47:19] traits to nodes (through occurrences)...
[INFO] [2019-12-25 14:47:19] Traits to sex term...
[INFO] [2019-12-25 14:47:32] Traits to lifestage term...
[INFO] [2019-12-25 14:47:37] MetaTraits to traits...
[INFO] [2019-12-25 14:47:39] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-12-25 14:47:46] Assocs to occurrences...
[INFO] [2019-12-25 14:47:46] Assocs to nodes...
[INFO] [2019-12-25 14:47:46] Assoc to sex term...
[INFO] [2019-12-25 14:47:46] Assoc to lifestage term...
[STOP] [2019-12-25 14:47:46] resolve_keys
[START] [2019-12-25 14:47:46] hold_for_later_1
[STOP] [2019-12-25 14:47:46] hold_for_later_1
[START] [2019-12-25 14:47:46] hold_for_later_2
[STOP] [2019-12-25 14:47:46] hold_for_later_2
[START] [2019-12-25 14:47:46] resolve_missing_parents
[STOP] [2019-12-25 14:48:16] resolve_missing_parents
[START] [2019-12-25 14:48:16] rebuild_nodes
[START] [2019-12-25 14:48:16] Flattener#flatten
[START] [2019-12-25 14:48:16] Flattener#study_resource
[START] [2019-12-25 14:48:16] Flattener#build_ancestry
[STOP] [2019-12-25 14:48:18] Flattener#build_ancestry
[INFO] [2019-12-25 14:48:18] 16092 ancestry keys
[START] [2019-12-25 14:48:18] build_node_ancestors
[INFO] [2019-12-25 14:48:18] old ancestors deleted.
[STOP] [2019-12-25 14:48:24] build_node_ancestors
[START] [2019-12-25 14:48:30] Flattener#propagate_ancestor_ids
[STOP] [2019-12-25 14:48:36] Flattener#propagate_ancestor_ids
[STOP] [2019-12-25 14:48:36] Flattener#flatten
[STOP] [2019-12-25 14:48:36] rebuild_nodes
[START] [2019-12-25 14:48:36] resolve_missing_media_owners
[STOP] [2019-12-25 14:48:36] resolve_missing_media_owners
[START] [2019-12-25 14:48:36] sanitize_media_verbatims
[STOP] [2019-12-25 14:48:36] sanitize_media_verbatims
[START] [2019-12-25 14:48:36] queue_downloads
[STOP] [2019-12-25 14:48:36] queue_downloads
[START] [2019-12-25 14:48:36] parse_names
[WARN] [2019-12-25 14:48:36] I see 16092 names which still need to be parsed.
[STOP] [2019-12-25 14:48:48] parse_names
[START] [2019-12-25 14:48:48] denormalize_canonical_names_to_nodes
[STOP] [2019-12-25 14:48:49] denormalize_canonical_names_to_nodes
[START] [2019-12-25 14:48:49] match_nodes
[START] [2019-12-25 14:48:49] map_all_nodes_to_pages
[STOP] [2019-12-25 15:44:32] map_all_nodes_to_pages
[INFO] [2019-12-25 15:44:32] 1342 Unmatched nodes (of 16092)! That's too many to output. First 10: Themisto bispinosa (#62493933); Eusiroides georgianus (#62489047); Epimeriella (#62484345); Euandania (#62491756); Schellenbergia (#62493646); Schellenbergia vanhoeffeni (#62493645); Orchomene plebs (#62484736); Orchomene rossi (#62484811); Orchomenella acanthura (#62487783); Lepidepecreum carinatum (#62497313)
[START] [2019-12-25 15:44:32] update_nodes
[STOP] [2019-12-25 15:44:39] update_nodes
[STOP] [2019-12-25 15:44:39] match_nodes
[START] [2019-12-25 15:44:39] reindex_search
[STOP] [2019-12-25 15:45:18] reindex_search
[START] [2019-12-25 15:45:18] normalize_units
[STOP] [2019-12-25 15:45:18] normalize_units
[START] [2019-12-25 15:45:18] calculate_statistics
[STOP] [2019-12-25 15:45:19] calculate_statistics
[START] [2019-12-25 15:45:19] complete_harvest_instance
[START] [2019-12-25 15:45:19] overall_tsv_creation
[INFO] [2019-12-25 15:45:19] Processing group of 16092 in 2 batches of 10000
[INFO] [2019-12-25 15:46:52] 5765 Traits (unfiltered)...
[INFO] [2019-12-25 15:47:05] 5765 Traits (filtered)...
[INFO] [2019-12-25 15:47:05] 0 Associations (filtered)...
[INFO] [2019-12-25 15:47:57] 28814 metadata added.
[INFO] [2019-12-25 15:47:57] 0 metadata added.
[INFO] [2019-12-25 15:49:13] 3941 Traits (unfiltered)...
[INFO] [2019-12-25 15:49:26] 3941 Traits (filtered)...
[INFO] [2019-12-25 15:49:26] 0 Associations (filtered)...
[INFO] [2019-12-25 15:50:15] 19685 metadata added.
[INFO] [2019-12-25 15:50:15] 0 metadata added.
[INFO] [2019-12-25 15:50:15] Average Time: 120.685
[INFO] [2019-12-25 15:50:15] Total Time: 4m57s
[STOP] [2019-12-25 15:50:15] overall_tsv_creation
[INFO] [2019-12-25 15:50:15] Done. Check your files:
[INFO] [2019-12-25 15:50:16] (16092 lines) /app/public/data/Southern_Ocean_S/publish_nodes.tsv
[INFO] [2019-12-25 15:50:16] (83580 lines) /app/public/data/Southern_Ocean_S/publish_node_ancestors.tsv
[INFO] [2019-12-25 15:50:17] (16092 lines) /app/public/data/Southern_Ocean_S/publish_scientific_names.tsv
[INFO] [2019-12-25 15:50:18] (9707 lines) /app/public/data/Southern_Ocean_S/publish_traits.tsv
[INFO] [2019-12-25 15:50:18] (48500 lines) /app/public/data/Southern_Ocean_S/publish_metadata.tsv
[STOP] [2019-12-25 15:50:19] complete_harvest_instance
[START] [2019-12-25 15:50:19] completed
[STOP] [2019-12-25 15:50:19] completed
[STOP] [2019-12-25 15:50:19] logged process, took 3981.77
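
The [START]/[STOP] pairs in the log above can be turned into per-stage durations, which makes it easier to see where the 3981.77 seconds went. The following is a minimal sketch, not part of the harvester itself: the function name `stage_durations` and the regular expression are assumptions made for illustration, and the sample lines are copied from the match_nodes portion of the log (the long [INFO] line about unmatched nodes is left out).

```python
import re
from datetime import datetime

# Matches lines such as "[START] [2019-12-25 14:48:49] match_nodes".
# The pattern is an assumption based on the lines shown in this log.
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)


def stage_durations(lines):
    """Return {stage_name: seconds} by pairing each [START] with its [STOP]."""
    started = {}
    durations = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # [INFO], [CMD], [WARN] and header lines are skipped
        kind, stamp, name = match.groups()
        # The final STOP line carries a ", took N" suffix; drop it.
        name = name.split(", took")[0]
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            started[name] = when
        elif name in started:
            durations[name] = (when - started.pop(name)).total_seconds()
    return durations


sample = [
    "[START] [2019-12-25 14:48:49] match_nodes",
    "[START] [2019-12-25 14:48:49] map_all_nodes_to_pages",
    "[STOP] [2019-12-25 15:44:32] map_all_nodes_to_pages",
    "[START] [2019-12-25 15:44:32] update_nodes",
    "[STOP] [2019-12-25 15:44:39] update_nodes",
    "[STOP] [2019-12-25 15:44:39] match_nodes",
]

for stage, seconds in stage_durations(sample).items():
    print(f"{stage}: {seconds:.0f}s")
```

On these sample lines, map_all_nodes_to_pages accounts for 3343 of the 3350 seconds spent in match_nodes, so node matching dominates the harvest. Nested stages are reported individually, so a parent's duration includes its children.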

Latest Process