Harvest for Palau Species List Created 15 Oct 06:10

Stage: completed
Fetched: 15 Oct 06:10
Validated: 15 Oct 06:10
Deltas Created: 15 Oct 06:10
Units Normalized: 15 Oct 06:18
Ancestry Built: 15 Oct 06:11
Nodes Matched: 15 Oct 06:17
Names Parsed: 15 Oct 06:11
New Models Stored: 15 Oct 06:10
Indexed: 15 Oct 06:18
Completed: 15 Oct 06:20
Time to Harvest: about 10 minutes (the log reports 612.18 seconds, 06:10:07 to 06:20:19)

Harvesting Log

(149 lines)
# Logfile created on 2019-10-15 06:10:07 -0400 by logger.rb/56815
[START] [2019-10-15 06:10:07] logged process
[START] [2019-10-15 06:10:07] create_harvest_instance
[STOP] [2019-10-15 06:10:07] create_harvest_instance
[START] [2019-10-15 06:10:07] fetch_files
[STOP] [2019-10-15 06:10:07] fetch_files
[START] [2019-10-15 06:10:07] validate_each_file
[STOP] [2019-10-15 06:10:08] validate_each_file
[START] [2019-10-15 06:10:08] convert_to_csv
[CMD] [2019-10-15 06:10:08] /usr/bin/sort /app/public/converted_csv/palau_sp_list_refs_16853.csv > /app/public/converted_csv/palau_sp_list_refs_16853.csv_sorted
[CMD] [2019-10-15 06:10:09] /usr/bin/sort /app/public/converted_csv/palau_sp_list_nodes_16854.csv > /app/public/converted_csv/palau_sp_list_nodes_16854.csv_sorted
[CMD] [2019-10-15 06:10:09] /usr/bin/sort /app/public/converted_csv/palau_sp_list_occurrences_16855.csv > /app/public/converted_csv/palau_sp_list_occurrences_16855.csv_sorted
[CMD] [2019-10-15 06:10:09] /usr/bin/sort /app/public/converted_csv/palau_sp_list_measurements_16856.csv > /app/public/converted_csv/palau_sp_list_measurements_16856.csv_sorted
[STOP] [2019-10-15 06:10:09] convert_to_csv
[START] [2019-10-15 06:10:09] calculate_delta
[CMD] [2019-10-15 06:10:09] echo "0a" > /app/public/diff/palau_sp_list_refs_16853.diff
[CMD] [2019-10-15 06:10:10] tail -n +1 /app/public/converted_csv/palau_sp_list_refs_16853.csv >> /app/public/diff/palau_sp_list_refs_16853.diff
[CMD] [2019-10-15 06:10:10] echo "." >> /app/public/diff/palau_sp_list_refs_16853.diff
[CMD] [2019-10-15 06:10:10] echo "0a" > /app/public/diff/palau_sp_list_nodes_16854.diff
[CMD] [2019-10-15 06:10:11] tail -n +1 /app/public/converted_csv/palau_sp_list_nodes_16854.csv >> /app/public/diff/palau_sp_list_nodes_16854.diff
[CMD] [2019-10-15 06:10:11] echo "." >> /app/public/diff/palau_sp_list_nodes_16854.diff
[CMD] [2019-10-15 06:10:11] echo "0a" > /app/public/diff/palau_sp_list_occurrences_16855.diff
[CMD] [2019-10-15 06:10:11] tail -n +1 /app/public/converted_csv/palau_sp_list_occurrences_16855.csv >> /app/public/diff/palau_sp_list_occurrences_16855.diff
[CMD] [2019-10-15 06:10:12] echo "." >> /app/public/diff/palau_sp_list_occurrences_16855.diff
[CMD] [2019-10-15 06:10:12] echo "0a" > /app/public/diff/palau_sp_list_measurements_16856.diff
[CMD] [2019-10-15 06:10:12] tail -n +1 /app/public/converted_csv/palau_sp_list_measurements_16856.csv >> /app/public/diff/palau_sp_list_measurements_16856.diff
[CMD] [2019-10-15 06:10:13] echo "." >> /app/public/diff/palau_sp_list_measurements_16856.diff
[STOP] [2019-10-15 06:10:13] calculate_delta
[START] [2019-10-15 06:10:13] parse_diff_and_store
[INFO] [2019-10-15 06:10:13] Loading refs diff file into memory (true lines)...
[INFO] [2019-10-15 06:10:13] Loading nodes diff file into memory (true lines)...
[INFO] [2019-10-15 06:10:16] Loading occurrences diff file into memory (true lines)...
[INFO] [2019-10-15 06:10:17] Loading measurements diff file into memory (true lines)...
[INFO] [2019-10-15 06:10:40] Storing 2 References
[INFO] [2019-10-15 06:10:40] Processing group of 2 in 1 groups of 1000
[INFO] [2019-10-15 06:10:40] Average Time: 0.0
[INFO] [2019-10-15 06:10:40] Total Time: 1s
[INFO] [2019-10-15 06:10:40] Storing 6651 ScientificNames
[INFO] [2019-10-15 06:10:40] Processing group of 6651 in 7 groups of 1000
[INFO] [2019-10-15 06:10:43] Average Time: 0.417
[INFO] [2019-10-15 06:10:43] Total Time: 3s
[INFO] [2019-10-15 06:10:43] last 3 / first 3: 1.04
[INFO] [2019-10-15 06:10:43] Std.Dev: 0.10954451150103323; Max: 0.55
[INFO] [2019-10-15 06:10:43] Storing 6651 Nodes
[INFO] [2019-10-15 06:10:43] Processing group of 6651 in 7 groups of 1000
[INFO] [2019-10-15 06:10:47] Average Time: 0.464
[INFO] [2019-10-15 06:10:47] Total Time: 4s
[INFO] [2019-10-15 06:10:47] last 3 / first 3: 0.4
[INFO] [2019-10-15 06:10:47] Std.Dev: 0.454972526643093; Max: 1.49
[INFO] [2019-10-15 06:10:47] Storing 3856 Occurrences
[INFO] [2019-10-15 06:10:47] Processing group of 3856 in 4 groups of 1000
[INFO] [2019-10-15 06:10:47] Average Time: 0.107
[INFO] [2019-10-15 06:10:47] Total Time: 1s
[INFO] [2019-10-15 06:10:47] Storing 8180 TraitsReferences
[INFO] [2019-10-15 06:10:47] Processing group of 8180 in 9 groups of 1000
[INFO] [2019-10-15 06:10:48] Average Time: 0.073
[INFO] [2019-10-15 06:10:48] Total Time: 1s
[INFO] [2019-10-15 06:10:48] last 3 / first 3: 0.5
[INFO] [2019-10-15 06:10:48] Std.Dev: 0.03162277660168379; Max: 0.16
[INFO] [2019-10-15 06:10:48] Storing 8179 Traits
[INFO] [2019-10-15 06:10:48] Processing group of 8179 in 9 groups of 1000
[INFO] [2019-10-15 06:10:50] Average Time: 0.282
[INFO] [2019-10-15 06:10:50] Total Time: 3s
[INFO] [2019-10-15 06:10:50] last 3 / first 3: 0.69
[INFO] [2019-10-15 06:10:50] Std.Dev: 0.09486832980505137; Max: 0.39
[INFO] [2019-10-15 06:10:50] Storing 8174 MetaTraits
[INFO] [2019-10-15 06:10:50] Processing group of 8174 in 9 groups of 1000
[INFO] [2019-10-15 06:10:51] Average Time: 0.1
[INFO] [2019-10-15 06:10:51] Total Time: 1s
[INFO] [2019-10-15 06:10:51] last 3 / first 3: 0.66
[INFO] [2019-10-15 06:10:51] Std.Dev: 0.03162277660168379; Max: 0.14
[STOP] [2019-10-15 06:10:51] parse_diff_and_store
[START] [2019-10-15 06:10:51] resolve_keys
[INFO] [2019-10-15 06:11:20] Occurrences to nodes (through scientific_names)...
[INFO] [2019-10-15 06:11:23] traits to occurrences...
[INFO] [2019-10-15 06:11:27] traits to nodes (through occurrences)...
[INFO] [2019-10-15 06:11:27] Traits to sex term...
[INFO] [2019-10-15 06:11:30] Traits to lifestage term...
[INFO] [2019-10-15 06:11:33] MetaTraits to traits...
[INFO] [2019-10-15 06:11:33] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-10-15 06:11:34] Assocs to occurrences...
[INFO] [2019-10-15 06:11:34] Assocs to nodes...
[INFO] [2019-10-15 06:11:34] Assoc to sex term...
[INFO] [2019-10-15 06:11:34] Assoc to lifestage term...
[STOP] [2019-10-15 06:11:34] resolve_keys
[START] [2019-10-15 06:11:34] hold_for_later_1
[STOP] [2019-10-15 06:11:34] hold_for_later_1
[START] [2019-10-15 06:11:34] hold_for_later_2
[STOP] [2019-10-15 06:11:34] hold_for_later_2
[START] [2019-10-15 06:11:34] resolve_missing_parents
[STOP] [2019-10-15 06:11:48] resolve_missing_parents
[START] [2019-10-15 06:11:48] rebuild_nodes
[START] [2019-10-15 06:11:48] Flattener#flatten
[START] [2019-10-15 06:11:48] Flattener#study_resource
[START] [2019-10-15 06:11:48] Flattener#build_ancestry
[STOP] [2019-10-15 06:11:48] Flattener#build_ancestry
[INFO] [2019-10-15 06:11:48] 6651 ancestry keys
[START] [2019-10-15 06:11:48] build_node_ancestors
[INFO] [2019-10-15 06:11:48] old ancestors deleted.
[STOP] [2019-10-15 06:11:49] build_node_ancestors
[START] [2019-10-15 06:11:50] Flattener#propagate_ancestor_ids
[STOP] [2019-10-15 06:11:50] Flattener#propagate_ancestor_ids
[STOP] [2019-10-15 06:11:50] Flattener#flatten
[STOP] [2019-10-15 06:11:50] rebuild_nodes
[START] [2019-10-15 06:11:50] resolve_missing_media_owners
[STOP] [2019-10-15 06:11:50] resolve_missing_media_owners
[START] [2019-10-15 06:11:50] sanitize_media_verbatims
[STOP] [2019-10-15 06:11:50] sanitize_media_verbatims
[START] [2019-10-15 06:11:50] queue_downloads
[STOP] [2019-10-15 06:11:50] queue_downloads
[START] [2019-10-15 06:11:50] parse_names
[WARN] [2019-10-15 06:11:50] I see 6651 names which still need to be parsed.
[STOP] [2019-10-15 06:11:56] parse_names
[START] [2019-10-15 06:11:56] denormalize_canonical_names_to_nodes
[STOP] [2019-10-15 06:11:56] denormalize_canonical_names_to_nodes
[START] [2019-10-15 06:11:56] match_nodes
[START] [2019-10-15 06:11:56] map_all_nodes_to_pages
[STOP] [2019-10-15 06:17:47] map_all_nodes_to_pages
[INFO] [2019-10-15 06:17:47] 321 Unmatched nodes (of 6651)! That's too many to output. First 10: Hydrophiidae (#51411775); Pelamis (#51417396); Strumigenys emmae (#51412001); Cardiocondyla minutior (#51412675); Egretta intermedia (#51411366); Rukia palauensis (#51415162); Pachycephala tenebrosus (#51413708); Coracina tenuirostris (#51413354); Pitohui tenebrosus (#51413465); Thalaseus (#51411216)
[START] [2019-10-15 06:17:47] update_nodes
[STOP] [2019-10-15 06:17:49] update_nodes
[STOP] [2019-10-15 06:17:49] match_nodes
[START] [2019-10-15 06:17:49] reindex_search
[STOP] [2019-10-15 06:18:04] reindex_search
[START] [2019-10-15 06:18:04] normalize_units
[STOP] [2019-10-15 06:18:04] normalize_units
[START] [2019-10-15 06:18:04] calculate_statistics
[STOP] [2019-10-15 06:18:04] calculate_statistics
[START] [2019-10-15 06:18:04] complete_harvest_instance
[START] [2019-10-15 06:18:04] overall_tsv_creation
[INFO] [2019-10-15 06:18:04] Processing group of 6651 in 1 batches of 10000
[INFO] [2019-10-15 06:19:18] 3856 Traits (unfiltered)...
[INFO] [2019-10-15 06:19:32] 3856 Traits (filtered)...
[INFO] [2019-10-15 06:19:32] 0 Associations (filtered)...
[INFO] [2019-10-15 06:20:17] 19274 metadata added.
[INFO] [2019-10-15 06:20:17] 0 metadata added.
[INFO] [2019-10-15 06:20:17] Average Time: 108.43
[INFO] [2019-10-15 06:20:17] Total Time: 2m14s
[STOP] [2019-10-15 06:20:17] overall_tsv_creation
[INFO] [2019-10-15 06:20:17] Done. Check your files:
[INFO] [2019-10-15 06:20:18] (6651 lines) /app/public/data/palau_sp_list/publish_nodes.tsv
[INFO] [2019-10-15 06:20:18] (14363 lines) /app/public/data/palau_sp_list/publish_node_ancestors.tsv
[INFO] [2019-10-15 06:20:18] (6651 lines) /app/public/data/palau_sp_list/publish_scientific_names.tsv
[INFO] [2019-10-15 06:20:19] (3857 lines) /app/public/data/palau_sp_list/publish_traits.tsv
[INFO] [2019-10-15 06:20:19] (19275 lines) /app/public/data/palau_sp_list/publish_metadata.tsv
[STOP] [2019-10-15 06:20:19] complete_harvest_instance
[START] [2019-10-15 06:20:19] completed
[STOP] [2019-10-15 06:20:19] completed
[STOP] [2019-10-15 06:20:19] logged process, took 612.18
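The [START]/[STOP] pairs in the log above can be turned into per-step durations, which makes it easier to see where the run spent its time. The following is a minimal sketch, not part of the harvester itself: the function name `step_durations` and the regular expression are assumptions made for illustration, and the sample lines are copied from the match_nodes portion of the log.

```python
import re
from datetime import datetime

# Matches only START/STOP lines; CMD, INFO, and WARN lines are ignored.
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)


def step_durations(lines):
    """Return (step_name, seconds) pairs in the order the steps stopped."""
    open_steps = {}  # step name -> start time
    durations = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        tag, stamp, name = match.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if tag == "START":
            open_steps[name] = when
        else:
            # The final STOP line carries a ", took N" suffix; strip it.
            name = name.split(",")[0]
            started = open_steps.pop(name, None)
            if started is not None:
                durations.append((name, (when - started).total_seconds()))
    return durations


sample = [
    "[START] [2019-10-15 06:11:56] match_nodes",
    "[START] [2019-10-15 06:11:56] map_all_nodes_to_pages",
    "[STOP] [2019-10-15 06:17:47] map_all_nodes_to_pages",
    "[STOP] [2019-10-15 06:17:49] match_nodes",
]
result = step_durations(sample)
for name, seconds in result:
    print(f"{name}: {seconds:.0f}s")
```

Run against the full log, a sketch like this would show that map_all_nodes_to_pages and overall_tsv_creation account for most of the elapsed time. Timestamps have one-second resolution, so very short steps report 0 seconds.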

Latest Process