Harvest for Estonia Species List, created 12 Oct 17:43

Stage: completed
Fetched: 12 Oct 17:43
Validated: 12 Oct 17:43
Deltas Created: 12 Oct 17:43
Units Normalized: 12 Oct 18:23
Ancestry Built: 12 Oct 17:48
Nodes Matched: 12 Oct 18:22
Names Parsed: 12 Oct 17:48
New Models Stored: 12 Oct 17:45
Indexed: 12 Oct 18:23
Completed: 12 Oct 18:29
Time to Harvest: 46 minutes
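
The stage timestamps in this summary are enough to cross-check the elapsed time by hand. The following is a minimal sketch, not part of the harvester: the helper name `elapsed_minutes` is hypothetical, and the year 2019 is an assumption taken from the logfile header, since the summary lines omit it.

```python
from datetime import datetime


def elapsed_minutes(start, end, year=2019):
    """Minutes between two summary stamps such as '12 Oct 17:43'.

    The summary omits the year, so it is supplied separately
    (assumed to be 2019, as in the logfile header).
    """
    fmt = "%d %b %H:%M %Y"
    t0 = datetime.strptime(f"{start} {year}", fmt)
    t1 = datetime.strptime(f"{end} {year}", fmt)
    return (t1 - t0).total_seconds() / 60


# "Created" stamp versus "Completed" stamp from the summary.
print(elapsed_minutes("12 Oct 17:43", "12 Oct 18:29"))
```

The result can be compared against the "Time to Harvest" figure and against the duration reported on the final line of the log.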

Harvesting Log

(161 lines)
# Logfile created on 2019-10-12 17:43:33 -0400 by logger.rb/56815
[START] [2019-10-12 17:43:33] logged process
[START] [2019-10-12 17:43:33] create_harvest_instance
[STOP] [2019-10-12 17:43:34] create_harvest_instance
[START] [2019-10-12 17:43:34] fetch_files
[STOP] [2019-10-12 17:43:34] fetch_files
[START] [2019-10-12 17:43:34] validate_each_file
[STOP] [2019-10-12 17:43:36] validate_each_file
[START] [2019-10-12 17:43:36] convert_to_csv
[CMD] [2019-10-12 17:43:36] /usr/bin/sort /app/public/converted_csv/estonia_sp_list_refs_15683.csv > /app/public/converted_csv/estonia_sp_list_refs_15683.csv_sorted
[CMD] [2019-10-12 17:43:36] /usr/bin/sort /app/public/converted_csv/estonia_sp_list_nodes_15684.csv > /app/public/converted_csv/estonia_sp_list_nodes_15684.csv_sorted
[CMD] [2019-10-12 17:43:37] /usr/bin/sort /app/public/converted_csv/estonia_sp_list_occurrences_15685.csv > /app/public/converted_csv/estonia_sp_list_occurrences_15685.csv_sorted
[CMD] [2019-10-12 17:43:37] /usr/bin/sort /app/public/converted_csv/estonia_sp_list_measurements_15686.csv > /app/public/converted_csv/estonia_sp_list_measurements_15686.csv_sorted
[STOP] [2019-10-12 17:43:37] convert_to_csv
[START] [2019-10-12 17:43:37] calculate_delta
[CMD] [2019-10-12 17:43:37] echo "0a" > /app/public/diff/estonia_sp_list_refs_15683.diff
[CMD] [2019-10-12 17:43:37] tail -n +1 /app/public/converted_csv/estonia_sp_list_refs_15683.csv >> /app/public/diff/estonia_sp_list_refs_15683.diff
[CMD] [2019-10-12 17:43:37] echo "." >> /app/public/diff/estonia_sp_list_refs_15683.diff
[CMD] [2019-10-12 17:43:37] echo "0a" > /app/public/diff/estonia_sp_list_nodes_15684.diff
[CMD] [2019-10-12 17:43:37] tail -n +1 /app/public/converted_csv/estonia_sp_list_nodes_15684.csv >> /app/public/diff/estonia_sp_list_nodes_15684.diff
[CMD] [2019-10-12 17:43:37] echo "." >> /app/public/diff/estonia_sp_list_nodes_15684.diff
[CMD] [2019-10-12 17:43:37] echo "0a" > /app/public/diff/estonia_sp_list_occurrences_15685.diff
[CMD] [2019-10-12 17:43:37] tail -n +1 /app/public/converted_csv/estonia_sp_list_occurrences_15685.csv >> /app/public/diff/estonia_sp_list_occurrences_15685.diff
[CMD] [2019-10-12 17:43:38] echo "." >> /app/public/diff/estonia_sp_list_occurrences_15685.diff
[CMD] [2019-10-12 17:43:38] echo "0a" > /app/public/diff/estonia_sp_list_measurements_15686.diff
[CMD] [2019-10-12 17:43:38] tail -n +1 /app/public/converted_csv/estonia_sp_list_measurements_15686.csv >> /app/public/diff/estonia_sp_list_measurements_15686.diff
[CMD] [2019-10-12 17:43:38] echo "." >> /app/public/diff/estonia_sp_list_measurements_15686.diff
[STOP] [2019-10-12 17:43:38] calculate_delta
[START] [2019-10-12 17:43:38] parse_diff_and_store
[INFO] [2019-10-12 17:43:38] Loading refs diff file into memory (true lines)...
[INFO] [2019-10-12 17:43:38] Loading nodes diff file into memory (true lines)...
[INFO] [2019-10-12 17:43:46] Loading occurrences diff file into memory (true lines)...
[INFO] [2019-10-12 17:43:48] Loading measurements diff file into memory (true lines)...
[INFO] [2019-10-12 17:45:04] Storing 2 References
[INFO] [2019-10-12 17:45:04] Processing group of 2 in 1 groups of 1000
[INFO] [2019-10-12 17:45:04] Average Time: 0.0
[INFO] [2019-10-12 17:45:04] Total Time: 1s
[INFO] [2019-10-12 17:45:04] Storing 20202 ScientificNames
[INFO] [2019-10-12 17:45:04] Processing group of 20202 in 21 groups of 1000
[INFO] [2019-10-12 17:45:12] Average Time: 0.343
[INFO] [2019-10-12 17:45:12] Total Time: 8s
[INFO] [2019-10-12 17:45:12] last 3 / first 3: 0.62
[INFO] [2019-10-12 17:45:12] Std.Dev: 0.07071067811865475; Max: 0.44
[INFO] [2019-10-12 17:45:12] Storing 20202 Nodes
[INFO] [2019-10-12 17:45:12] Processing group of 20202 in 21 groups of 1000
[INFO] [2019-10-12 17:45:19] Average Time: 0.32
[INFO] [2019-10-12 17:45:19] Total Time: 7s
[INFO] [2019-10-12 17:45:19] last 3 / first 3: 0.73
[INFO] [2019-10-12 17:45:19] Std.Dev: 0.11401754250991379; Max: 0.67
[INFO] [2019-10-12 17:45:19] Storing 13215 Occurrences
[INFO] [2019-10-12 17:45:19] Processing group of 13215 in 14 groups of 1000
[INFO] [2019-10-12 17:45:20] Average Time: 0.1
[INFO] [2019-10-12 17:45:20] Total Time: 2s
[INFO] [2019-10-12 17:45:20] last 3 / first 3: 1.0
[INFO] [2019-10-12 17:45:20] Std.Dev: 0.0; Max: 0.11
[INFO] [2019-10-12 17:45:20] Storing 26430 TraitsReferences
[INFO] [2019-10-12 17:45:20] Processing group of 26430 in 27 groups of 1000
[INFO] [2019-10-12 17:45:22] Average Time: 0.068
[INFO] [2019-10-12 17:45:22] Total Time: 2s
[INFO] [2019-10-12 17:45:22] last 3 / first 3: 0.59
[INFO] [2019-10-12 17:45:22] Std.Dev: 0.0; Max: 0.15
[INFO] [2019-10-12 17:45:22] Storing 26430 Traits
[INFO] [2019-10-12 17:45:22] Processing group of 26430 in 27 groups of 1000
[INFO] [2019-10-12 17:45:31] Average Time: 0.346
[INFO] [2019-10-12 17:45:31] Total Time: 10s
[INFO] [2019-10-12 17:45:31] last 3 / first 3: 1.43
[INFO] [2019-10-12 17:45:31] Std.Dev: 0.13784048752090222; Max: 0.92
[INFO] [2019-10-12 17:45:31] Storing 26407 MetaTraits
[INFO] [2019-10-12 17:45:31] Processing group of 26407 in 27 groups of 1000
[INFO] [2019-10-12 17:45:35] Average Time: 0.126
[INFO] [2019-10-12 17:45:35] Total Time: 4s
[INFO] [2019-10-12 17:45:35] last 3 / first 3: 0.71
[INFO] [2019-10-12 17:45:35] Std.Dev: 0.0; Max: 0.16
[STOP] [2019-10-12 17:45:35] parse_diff_and_store
[START] [2019-10-12 17:45:35] resolve_keys
[INFO] [2019-10-12 17:46:39] Occurrences to nodes (through scientific_names)...
[INFO] [2019-10-12 17:46:44] traits to occurrences...
[INFO] [2019-10-12 17:46:48] traits to nodes (through occurrences)...
[INFO] [2019-10-12 17:46:49] Traits to sex term...
[INFO] [2019-10-12 17:46:53] Traits to lifestage term...
[INFO] [2019-10-12 17:46:57] MetaTraits to traits...
[INFO] [2019-10-12 17:46:59] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-10-12 17:47:03] Assocs to occurrences...
[INFO] [2019-10-12 17:47:03] Assocs to nodes...
[INFO] [2019-10-12 17:47:03] Assoc to sex term...
[INFO] [2019-10-12 17:47:03] Assoc to lifestage term...
[STOP] [2019-10-12 17:47:03] resolve_keys
[START] [2019-10-12 17:47:03] hold_for_later_1
[STOP] [2019-10-12 17:47:03] hold_for_later_1
[START] [2019-10-12 17:47:03] hold_for_later_2
[STOP] [2019-10-12 17:47:03] hold_for_later_2
[START] [2019-10-12 17:47:03] resolve_missing_parents
[STOP] [2019-10-12 17:47:41] resolve_missing_parents
[START] [2019-10-12 17:47:41] rebuild_nodes
[START] [2019-10-12 17:47:41] Flattener#flatten
[START] [2019-10-12 17:47:41] Flattener#study_resource
[START] [2019-10-12 17:47:41] Flattener#build_ancestry
[STOP] [2019-10-12 17:47:43] Flattener#build_ancestry
[INFO] [2019-10-12 17:47:43] 20202 ancestry keys
[START] [2019-10-12 17:47:43] build_node_ancestors
[INFO] [2019-10-12 17:47:43] old ancestors deleted.
[STOP] [2019-10-12 17:47:58] build_node_ancestors
[START] [2019-10-12 17:47:59] Flattener#propagate_ancestor_ids
[STOP] [2019-10-12 17:48:01] Flattener#propagate_ancestor_ids
[STOP] [2019-10-12 17:48:01] Flattener#flatten
[STOP] [2019-10-12 17:48:01] rebuild_nodes
[START] [2019-10-12 17:48:01] resolve_missing_media_owners
[STOP] [2019-10-12 17:48:01] resolve_missing_media_owners
[START] [2019-10-12 17:48:01] sanitize_media_verbatims
[STOP] [2019-10-12 17:48:01] sanitize_media_verbatims
[START] [2019-10-12 17:48:01] queue_downloads
[STOP] [2019-10-12 17:48:01] queue_downloads
[START] [2019-10-12 17:48:01] parse_names
[WARN] [2019-10-12 17:48:01] I see 20202 names which still need to be parsed.
[STOP] [2019-10-12 17:48:17] parse_names
[START] [2019-10-12 17:48:17] denormalize_canonical_names_to_nodes
[STOP] [2019-10-12 17:48:17] denormalize_canonical_names_to_nodes
[START] [2019-10-12 17:48:17] match_nodes
[START] [2019-10-12 17:48:17] map_all_nodes_to_pages
[STOP] [2019-10-12 18:21:55] map_all_nodes_to_pages
[INFO] [2019-10-12 18:21:55] 2241 Unmatched nodes (of 20202)! That's too many to output. First 10: Hieraaetus pennata (#49575139); Carduelis chloris (#49557583); Carduelis spinus (#49557591); Carduelis cannabina (#49557686); Carduelis flammea (#49560957); Carduelis hornemanni (#49574309); Parus caeruleus (#49557597); Parus montanus (#49557615); Parus palustris (#49557655); Parus cristatus (#49557664)
[START] [2019-10-12 18:21:55] update_nodes
[STOP] [2019-10-12 18:22:02] update_nodes
[STOP] [2019-10-12 18:22:02] match_nodes
[START] [2019-10-12 18:22:02] reindex_search
[STOP] [2019-10-12 18:23:06] reindex_search
[START] [2019-10-12 18:23:06] normalize_units
[STOP] [2019-10-12 18:23:06] normalize_units
[START] [2019-10-12 18:23:06] calculate_statistics
[STOP] [2019-10-12 18:23:06] calculate_statistics
[START] [2019-10-12 18:23:06] complete_harvest_instance
[START] [2019-10-12 18:23:06] overall_tsv_creation
[INFO] [2019-10-12 18:23:06] Processing group of 20202 in 3 batches of 10000
[INFO] [2019-10-12 18:24:34] 5839 Traits (unfiltered)...
[INFO] [2019-10-12 18:24:48] 5839 Traits (filtered)...
[INFO] [2019-10-12 18:24:48] 0 Associations (filtered)...
[INFO] [2019-10-12 18:25:36] 29188 metadata added.
[INFO] [2019-10-12 18:25:36] 0 metadata added.
[INFO] [2019-10-12 18:27:10] 7225 Traits (unfiltered)...
[INFO] [2019-10-12 18:27:24] 7225 Traits (filtered)...
[INFO] [2019-10-12 18:27:25] 0 Associations (filtered)...
[INFO] [2019-10-12 18:28:18] 36111 metadata added.
[INFO] [2019-10-12 18:28:18] 0 metadata added.
[INFO] [2019-10-12 18:29:03] 151 Traits (unfiltered)...
[INFO] [2019-10-12 18:29:17] 151 Traits (filtered)...
[INFO] [2019-10-12 18:29:17] 0 Associations (filtered)...
[INFO] [2019-10-12 18:29:53] 753 metadata added.
[INFO] [2019-10-12 18:29:53] 0 metadata added.
[INFO] [2019-10-12 18:29:53] Average Time: 110.573
[INFO] [2019-10-12 18:29:53] Total Time: 6m48s
[STOP] [2019-10-12 18:29:53] overall_tsv_creation
[INFO] [2019-10-12 18:29:53] Done. Check your files:
[INFO] [2019-10-12 18:29:53] (20202 lines) /app/public/data/estonia_sp_list/publish_nodes.tsv
[INFO] [2019-10-12 18:29:53] (107383 lines) /app/public/data/estonia_sp_list/publish_node_ancestors.tsv
[INFO] [2019-10-12 18:29:54] (20202 lines) /app/public/data/estonia_sp_list/publish_scientific_names.tsv
[INFO] [2019-10-12 18:29:54] (13216 lines) /app/public/data/estonia_sp_list/publish_traits.tsv
[INFO] [2019-10-12 18:29:54] (66053 lines) /app/public/data/estonia_sp_list/publish_metadata.tsv
[STOP] [2019-10-12 18:29:54] complete_harvest_instance
[START] [2019-10-12 18:29:54] completed
[STOP] [2019-10-12 18:29:54] completed
[STOP] [2019-10-12 18:29:54] logged process, took 2780.75
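
To see where the time goes, the [START] and [STOP] lines can be paired up by stage name. The sketch below is illustrative only: the function `stage_durations` and the regular expression are assumptions inferred from the line format shown in this log, not code from the harvester itself.

```python
import re
from datetime import datetime

# Matches lines such as "[START] [2019-10-12 17:43:34] fetch_files".
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)


def stage_durations(lines):
    """Return {stage_name: seconds} for every START/STOP pair found."""
    started = {}
    durations = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip [INFO], [CMD], [WARN] and header lines
        kind, stamp, name = m.groups()
        # "logged process, took 2780.75" -> "logged process"
        name = name.split(",")[0]
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            started[name] = when
        elif name in started:
            durations[name] = (when - started.pop(name)).total_seconds()
    return durations


# Four lines copied from the log above.
sample = [
    "[START] [2019-10-12 17:48:17] match_nodes",
    "[START] [2019-10-12 17:48:17] map_all_nodes_to_pages",
    "[STOP] [2019-10-12 18:21:55] map_all_nodes_to_pages",
    "[STOP] [2019-10-12 18:22:02] match_nodes",
]
for name, seconds in stage_durations(sample).items():
    print(f"{name}: {seconds:.0f}s")
```

Applied to the lines above, this shows map_all_nodes_to_pages taking 2018 seconds, roughly 34 of the 46 minutes of the whole run.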

Latest Process