Harvest for Tonga Species List
Created: 16 Oct 18:20

Stage: completed
Fetched: 16 Oct 18:20
Validated: 16 Oct 18:20
Deltas Created: 16 Oct 18:20
Units Normalized: 16 Oct 18:26
Ancestry Built: 16 Oct 18:21
Nodes Matched: 16 Oct 18:26
Names Parsed: 16 Oct 18:21
New Models Stored: 16 Oct 18:20
Indexed: 16 Oct 18:26
Completed: 16 Oct 18:28
Time to Harvest: about 8 minutes (18:20 to 18:28)

Harvesting Log (most recent first)

# Logfile created on 2019-10-16 18:20:10 -0400 by logger.rb/56815
[START] [2019-10-16 18:20:10] logged process
[START] [2019-10-16 18:20:10] create_harvest_instance
[STOP] [2019-10-16 18:20:10] create_harvest_instance
[START] [2019-10-16 18:20:10] fetch_files
[STOP] [2019-10-16 18:20:10] fetch_files
[START] [2019-10-16 18:20:10] validate_each_file
[STOP] [2019-10-16 18:20:11] validate_each_file
[START] [2019-10-16 18:20:11] convert_to_csv
[CMD] [2019-10-16 18:20:11] /usr/bin/sort /app/public/converted_csv/tonga_sp_list_refs_17529.csv > /app/public/converted_csv/tonga_sp_list_refs_17529.csv_sorted
[CMD] [2019-10-16 18:20:11] /usr/bin/sort /app/public/converted_csv/tonga_sp_list_nodes_17530.csv > /app/public/converted_csv/tonga_sp_list_nodes_17530.csv_sorted
[CMD] [2019-10-16 18:20:11] /usr/bin/sort /app/public/converted_csv/tonga_sp_list_occurrences_17531.csv > /app/public/converted_csv/tonga_sp_list_occurrences_17531.csv_sorted
[CMD] [2019-10-16 18:20:11] /usr/bin/sort /app/public/converted_csv/tonga_sp_list_measurements_17532.csv > /app/public/converted_csv/tonga_sp_list_measurements_17532.csv_sorted
[STOP] [2019-10-16 18:20:11] convert_to_csv
[START] [2019-10-16 18:20:11] calculate_delta
[CMD] [2019-10-16 18:20:11] echo "0a" > /app/public/diff/tonga_sp_list_refs_17529.diff
[CMD] [2019-10-16 18:20:11] tail -n +1 /app/public/converted_csv/tonga_sp_list_refs_17529.csv >> /app/public/diff/tonga_sp_list_refs_17529.diff
[CMD] [2019-10-16 18:20:11] echo "." >> /app/public/diff/tonga_sp_list_refs_17529.diff
[CMD] [2019-10-16 18:20:11] echo "0a" > /app/public/diff/tonga_sp_list_nodes_17530.diff
[CMD] [2019-10-16 18:20:11] tail -n +1 /app/public/converted_csv/tonga_sp_list_nodes_17530.csv >> /app/public/diff/tonga_sp_list_nodes_17530.diff
[CMD] [2019-10-16 18:20:12] echo "." >> /app/public/diff/tonga_sp_list_nodes_17530.diff
[CMD] [2019-10-16 18:20:12] echo "0a" > /app/public/diff/tonga_sp_list_occurrences_17531.diff
[CMD] [2019-10-16 18:20:12] tail -n +1 /app/public/converted_csv/tonga_sp_list_occurrences_17531.csv >> /app/public/diff/tonga_sp_list_occurrences_17531.diff
[CMD] [2019-10-16 18:20:12] echo "." >> /app/public/diff/tonga_sp_list_occurrences_17531.diff
[CMD] [2019-10-16 18:20:12] echo "0a" > /app/public/diff/tonga_sp_list_measurements_17532.diff
[CMD] [2019-10-16 18:20:12] tail -n +1 /app/public/converted_csv/tonga_sp_list_measurements_17532.csv >> /app/public/diff/tonga_sp_list_measurements_17532.diff
[CMD] [2019-10-16 18:20:12] echo "." >> /app/public/diff/tonga_sp_list_measurements_17532.diff
[STOP] [2019-10-16 18:20:12] calculate_delta
[START] [2019-10-16 18:20:12] parse_diff_and_store
[INFO] [2019-10-16 18:20:12] Loading refs diff file into memory (true lines)...
[INFO] [2019-10-16 18:20:12] Loading nodes diff file into memory (true lines)...
[INFO] [2019-10-16 18:20:14] Loading occurrences diff file into memory (true lines)...
[INFO] [2019-10-16 18:20:14] Loading measurements diff file into memory (true lines)...
[INFO] [2019-10-16 18:20:28] Storing 2 References
[INFO] [2019-10-16 18:20:28] Processing group of 2 in 1 groups of 1000
[INFO] [2019-10-16 18:20:28] Average Time: 0.0
[INFO] [2019-10-16 18:20:28] Total Time: 1s
[INFO] [2019-10-16 18:20:28] Storing 4656 ScientificNames
[INFO] [2019-10-16 18:20:28] Processing group of 4656 in 5 groups of 1000
[INFO] [2019-10-16 18:20:30] Average Time: 0.332
[INFO] [2019-10-16 18:20:30] Total Time: 2s
[INFO] [2019-10-16 18:20:30] Storing 4656 Nodes
[INFO] [2019-10-16 18:20:30] Processing group of 4656 in 5 groups of 1000
[INFO] [2019-10-16 18:20:31] Average Time: 0.264
[INFO] [2019-10-16 18:20:31] Total Time: 2s
[INFO] [2019-10-16 18:20:31] Storing 2410 Occurrences
[INFO] [2019-10-16 18:20:31] Processing group of 2410 in 3 groups of 1000
[INFO] [2019-10-16 18:20:32] Average Time: 0.093
[INFO] [2019-10-16 18:20:32] Total Time: 1s
[INFO] [2019-10-16 18:20:32] Storing 5250 TraitsReferences
[INFO] [2019-10-16 18:20:32] Processing group of 5250 in 6 groups of 1000
[INFO] [2019-10-16 18:20:32] Average Time: 0.067
[INFO] [2019-10-16 18:20:32] Total Time: 1s
[INFO] [2019-10-16 18:20:32] Storing 5249 Traits
[INFO] [2019-10-16 18:20:32] Processing group of 5249 in 6 groups of 1000
[INFO] [2019-10-16 18:20:34] Average Time: 0.272
[INFO] [2019-10-16 18:20:34] Total Time: 2s
[INFO] [2019-10-16 18:20:34] Storing 5250 MetaTraits
[INFO] [2019-10-16 18:20:34] Processing group of 5250 in 6 groups of 1000
[INFO] [2019-10-16 18:20:34] Average Time: 0.105
[INFO] [2019-10-16 18:20:34] Total Time: 1s
[STOP] [2019-10-16 18:20:34] parse_diff_and_store
[START] [2019-10-16 18:20:34] resolve_keys
[INFO] [2019-10-16 18:20:54] Occurrences to nodes (through scientific_names)...
[INFO] [2019-10-16 18:20:56] traits to occurrences...
[INFO] [2019-10-16 18:20:59] traits to nodes (through occurrences)...
[INFO] [2019-10-16 18:20:59] Traits to sex term...
[INFO] [2019-10-16 18:21:01] Traits to lifestage term...
[INFO] [2019-10-16 18:21:03] MetaTraits to traits...
[INFO] [2019-10-16 18:21:03] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-10-16 18:21:04] Assocs to occurrences...
[INFO] [2019-10-16 18:21:04] Assocs to nodes...
[INFO] [2019-10-16 18:21:04] Assoc to sex term...
[INFO] [2019-10-16 18:21:04] Assoc to lifestage term...
[STOP] [2019-10-16 18:21:04] resolve_keys
[START] [2019-10-16 18:21:04] hold_for_later_1
[STOP] [2019-10-16 18:21:04] hold_for_later_1
[START] [2019-10-16 18:21:04] hold_for_later_2
[STOP] [2019-10-16 18:21:04] hold_for_later_2
[START] [2019-10-16 18:21:04] resolve_missing_parents
[STOP] [2019-10-16 18:21:13] resolve_missing_parents
[START] [2019-10-16 18:21:13] rebuild_nodes
[START] [2019-10-16 18:21:13] Flattener#flatten
[START] [2019-10-16 18:21:13] Flattener#study_resource
[START] [2019-10-16 18:21:13] Flattener#build_ancestry
[STOP] [2019-10-16 18:21:13] Flattener#build_ancestry
[INFO] [2019-10-16 18:21:13] 4656 ancestry keys
[START] [2019-10-16 18:21:13] build_node_ancestors
[INFO] [2019-10-16 18:21:13] old ancestors deleted.
[STOP] [2019-10-16 18:21:14] build_node_ancestors
[START] [2019-10-16 18:21:15] Flattener#propagate_ancestor_ids
[STOP] [2019-10-16 18:21:15] Flattener#propagate_ancestor_ids
[STOP] [2019-10-16 18:21:15] Flattener#flatten
[STOP] [2019-10-16 18:21:15] rebuild_nodes
[START] [2019-10-16 18:21:15] resolve_missing_media_owners
[STOP] [2019-10-16 18:21:15] resolve_missing_media_owners
[START] [2019-10-16 18:21:15] sanitize_media_verbatims
[STOP] [2019-10-16 18:21:15] sanitize_media_verbatims
[START] [2019-10-16 18:21:15] queue_downloads
[STOP] [2019-10-16 18:21:15] queue_downloads
[START] [2019-10-16 18:21:15] parse_names
[WARN] [2019-10-16 18:21:15] I see 4656 names which still need to be parsed.
[STOP] [2019-10-16 18:21:20] parse_names
[START] [2019-10-16 18:21:20] denormalize_canonical_names_to_nodes
[STOP] [2019-10-16 18:21:20] denormalize_canonical_names_to_nodes
[START] [2019-10-16 18:21:20] match_nodes
[START] [2019-10-16 18:21:20] map_all_nodes_to_pages
[STOP] [2019-10-16 18:26:10] map_all_nodes_to_pages
[INFO] [2019-10-16 18:26:10] 291 Unmatched nodes (of 4656)! That's too many to output. First 10: Thalaseus (#52500464); Thalaseus bergii (#52500463); Procelsterna cerulea (#52501517); Tringa incanus (#52503908); Puffinus pacificus (#52500053); Puffinus tenuirostris (#52500929); Puffinus griseus (#52501129); Ostorhinchus cyanosomus (#52504108); Nectamia fuscus (#52504583); Ptereleotridae (#52501493)
[START] [2019-10-16 18:26:10] update_nodes
[STOP] [2019-10-16 18:26:11] update_nodes
[STOP] [2019-10-16 18:26:11] match_nodes
[START] [2019-10-16 18:26:11] reindex_search
[STOP] [2019-10-16 18:26:22] reindex_search
[START] [2019-10-16 18:26:22] normalize_units
[STOP] [2019-10-16 18:26:22] normalize_units
[START] [2019-10-16 18:26:22] calculate_statistics
[STOP] [2019-10-16 18:26:22] calculate_statistics
[START] [2019-10-16 18:26:22] complete_harvest_instance
[START] [2019-10-16 18:26:22] overall_tsv_creation
[INFO] [2019-10-16 18:26:22] Processing group of 4656 in 1 batches of 10000
[INFO] [2019-10-16 18:27:24] 2410 Traits (unfiltered)...
[INFO] [2019-10-16 18:27:37] 2410 Traits (filtered)...
[INFO] [2019-10-16 18:27:37] 0 Associations (filtered)...
[INFO] [2019-10-16 18:28:19] 12050 metadata added.
[INFO] [2019-10-16 18:28:19] 0 metadata added.
[INFO] [2019-10-16 18:28:19] Average Time: 92.99
[INFO] [2019-10-16 18:28:19] Total Time: 1m57s
[STOP] [2019-10-16 18:28:19] overall_tsv_creation
[INFO] [2019-10-16 18:28:19] Done. Check your files:
[INFO] [2019-10-16 18:28:19] (4656 lines) /app/public/data/tonga_sp_list/publish_nodes.tsv
[INFO] [2019-10-16 18:28:19] (9565 lines) /app/public/data/tonga_sp_list/publish_node_ancestors.tsv
[INFO] [2019-10-16 18:28:19] (4656 lines) /app/public/data/tonga_sp_list/publish_scientific_names.tsv
[INFO] [2019-10-16 18:28:19] (2411 lines) /app/public/data/tonga_sp_list/publish_traits.tsv
[INFO] [2019-10-16 18:28:19] (12051 lines) /app/public/data/tonga_sp_list/publish_metadata.tsv
[STOP] [2019-10-16 18:28:19] complete_harvest_instance
[START] [2019-10-16 18:28:19] completed
[STOP] [2019-10-16 18:28:19] completed
[STOP] [2019-10-16 18:28:19] logged process, took 489.38
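
The [START]/[STOP] pairs in the log above can be turned into per-step durations, which makes it easier to see where the time went (here, mostly in map_all_nodes_to_pages). The following is a minimal sketch, not part of the harvester itself: the function name step_durations and the regular expression are assumptions made for illustration, and the sample lines are copied from the match_nodes portion of the log.

```python
import re
from datetime import datetime

# Matches lines such as "[START] [2019-10-16 18:21:20] match_nodes".
# The pattern is an assumption based on the log lines shown above.
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)

def step_durations(lines):
    """Return (step_name, seconds) tuples in the order the steps stopped."""
    open_steps = {}  # step name -> list of start times (handles repeated names)
    durations = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip [INFO], [CMD], [WARN] and non-log lines
        kind, stamp, name = match.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            open_steps.setdefault(name, []).append(when)
        else:
            # The final STOP line carries a ", took N" suffix; drop it.
            name = name.split(", took")[0]
            starts = open_steps.get(name)
            if starts:
                durations.append((name, (when - starts.pop()).total_seconds()))
    return durations

# Lines copied from the log above (match_nodes and its nested steps).
sample = [
    "[START] [2019-10-16 18:21:20] match_nodes",
    "[START] [2019-10-16 18:21:20] map_all_nodes_to_pages",
    "[STOP] [2019-10-16 18:26:10] map_all_nodes_to_pages",
    "[START] [2019-10-16 18:26:10] update_nodes",
    "[STOP] [2019-10-16 18:26:11] update_nodes",
    "[STOP] [2019-10-16 18:26:11] match_nodes",
]

result = step_durations(sample)
for name, seconds in result:
    print(f"{name}: {seconds:.0f}s")
```

Because timestamps in the log have one-second resolution, durations computed this way are only accurate to about a second; the "took 489.38" figure on the last line is the logger's own, finer-grained measurement.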