Harvest for Russia Species List
Created: 15 Oct 14:46

Stage: completed
Fetched: 15 Oct 14:46
Validated: 15 Oct 14:46
Deltas Created: 15 Oct 14:46
Units Normalized: 15 Oct 15:53
Ancestry Built: 15 Oct 14:53
Nodes Matched: 15 Oct 15:51
Names Parsed: 15 Oct 14:54
New Models Stored: 15 Oct 14:49
Indexed: 15 Oct 15:53
Completed: 15 Oct 16:03
Time to Harvest: 1 minute

Harvesting Log

(166 lines)
# Logfile created on 2019-10-15 14:46:19 -0400 by logger.rb/56815
[START] [2019-10-15 14:46:19] logged process
[START] [2019-10-15 14:46:19] create_harvest_instance
[STOP] [2019-10-15 14:46:20] create_harvest_instance
[START] [2019-10-15 14:46:20] fetch_files
[STOP] [2019-10-15 14:46:20] fetch_files
[START] [2019-10-15 14:46:20] validate_each_file
[STOP] [2019-10-15 14:46:24] validate_each_file
[START] [2019-10-15 14:46:24] convert_to_csv
[CMD] [2019-10-15 14:46:24] /usr/bin/sort /app/public/converted_csv/russia_sp_list_refs_17005.csv > /app/public/converted_csv/russia_sp_list_refs_17005.csv_sorted
[CMD] [2019-10-15 14:46:24] /usr/bin/sort /app/public/converted_csv/russia_sp_list_nodes_17006.csv > /app/public/converted_csv/russia_sp_list_nodes_17006.csv_sorted
[CMD] [2019-10-15 14:46:24] /usr/bin/sort /app/public/converted_csv/russia_sp_list_occurrences_17007.csv > /app/public/converted_csv/russia_sp_list_occurrences_17007.csv_sorted
[CMD] [2019-10-15 14:46:25] /usr/bin/sort /app/public/converted_csv/russia_sp_list_measurements_17008.csv > /app/public/converted_csv/russia_sp_list_measurements_17008.csv_sorted
[STOP] [2019-10-15 14:46:25] convert_to_csv
[START] [2019-10-15 14:46:25] calculate_delta
[CMD] [2019-10-15 14:46:25] echo "0a" > /app/public/diff/russia_sp_list_refs_17005.diff
[CMD] [2019-10-15 14:46:25] tail -n +1 /app/public/converted_csv/russia_sp_list_refs_17005.csv >> /app/public/diff/russia_sp_list_refs_17005.diff
[CMD] [2019-10-15 14:46:26] echo "." >> /app/public/diff/russia_sp_list_refs_17005.diff
[CMD] [2019-10-15 14:46:26] echo "0a" > /app/public/diff/russia_sp_list_nodes_17006.diff
[CMD] [2019-10-15 14:46:26] tail -n +1 /app/public/converted_csv/russia_sp_list_nodes_17006.csv >> /app/public/diff/russia_sp_list_nodes_17006.diff
[CMD] [2019-10-15 14:46:26] echo "." >> /app/public/diff/russia_sp_list_nodes_17006.diff
[CMD] [2019-10-15 14:46:27] echo "0a" > /app/public/diff/russia_sp_list_occurrences_17007.diff
[CMD] [2019-10-15 14:46:27] tail -n +1 /app/public/converted_csv/russia_sp_list_occurrences_17007.csv >> /app/public/diff/russia_sp_list_occurrences_17007.diff
[CMD] [2019-10-15 14:46:27] echo "." >> /app/public/diff/russia_sp_list_occurrences_17007.diff
[CMD] [2019-10-15 14:46:28] echo "0a" > /app/public/diff/russia_sp_list_measurements_17008.diff
[CMD] [2019-10-15 14:46:28] tail -n +1 /app/public/converted_csv/russia_sp_list_measurements_17008.csv >> /app/public/diff/russia_sp_list_measurements_17008.diff
[CMD] [2019-10-15 14:46:28] echo "." >> /app/public/diff/russia_sp_list_measurements_17008.diff
[STOP] [2019-10-15 14:46:28] calculate_delta
[START] [2019-10-15 14:46:28] parse_diff_and_store
[INFO] [2019-10-15 14:46:29] Loading refs diff file into memory (true lines)...
[INFO] [2019-10-15 14:46:29] Loading nodes diff file into memory (true lines)...
[INFO] [2019-10-15 14:46:43] Loading occurrences diff file into memory (true lines)...
[INFO] [2019-10-15 14:46:47] Loading measurements diff file into memory (true lines)...
[INFO] [2019-10-15 14:49:00] Storing 2 References
[INFO] [2019-10-15 14:49:00] Processing group of 2 in 1 groups of 1000
[INFO] [2019-10-15 14:49:00] Average Time: 0.0
[INFO] [2019-10-15 14:49:00] Total Time: 1s
[INFO] [2019-10-15 14:49:00] Storing 34942 ScientificNames
[INFO] [2019-10-15 14:49:00] Processing group of 34942 in 35 groups of 1000
[INFO] [2019-10-15 14:49:14] Average Time: 0.398
[INFO] [2019-10-15 14:49:14] Total Time: 15s
[INFO] [2019-10-15 14:49:14] last 3 / first 3: 1.47
[INFO] [2019-10-15 14:49:14] Std.Dev: 0.1; Max: 0.71
[INFO] [2019-10-15 14:49:14] Storing 34942 Nodes
[INFO] [2019-10-15 14:49:14] Processing group of 34942 in 35 groups of 1000
[INFO] [2019-10-15 14:49:25] Average Time: 0.301
[INFO] [2019-10-15 14:49:25] Total Time: 11s
[INFO] [2019-10-15 14:49:25] last 3 / first 3: 0.98
[INFO] [2019-10-15 14:49:25] Std.Dev: 0.03162277660168379; Max: 0.4
[INFO] [2019-10-15 14:49:25] Storing 23049 Occurrences
[INFO] [2019-10-15 14:49:25] Processing group of 23049 in 24 groups of 1000
[INFO] [2019-10-15 14:49:28] Average Time: 0.103
[INFO] [2019-10-15 14:49:28] Total Time: 3s
[INFO] [2019-10-15 14:49:28] last 3 / first 3: 0.57
[INFO] [2019-10-15 14:49:28] Std.Dev: 0.03162277660168379; Max: 0.17
[INFO] [2019-10-15 14:49:28] Storing 46098 TraitsReferences
[INFO] [2019-10-15 14:49:28] Processing group of 46098 in 47 groups of 1000
[INFO] [2019-10-15 14:49:31] Average Time: 0.071
[INFO] [2019-10-15 14:49:31] Total Time: 4s
[INFO] [2019-10-15 14:49:31] last 3 / first 3: 0.56
[INFO] [2019-10-15 14:49:31] Std.Dev: 0.03162277660168379; Max: 0.19
[INFO] [2019-10-15 14:49:31] Storing 46098 Traits
[INFO] [2019-10-15 14:49:31] Processing group of 46098 in 47 groups of 1000
[INFO] [2019-10-15 14:49:48] Average Time: 0.342
[INFO] [2019-10-15 14:49:48] Total Time: 17s
[INFO] [2019-10-15 14:49:48] last 3 / first 3: 0.54
[INFO] [2019-10-15 14:49:48] Std.Dev: 0.16431676725154984; Max: 1.35
[INFO] [2019-10-15 14:49:48] Storing 46062 MetaTraits
[INFO] [2019-10-15 14:49:48] Processing group of 46062 in 47 groups of 1000
[INFO] [2019-10-15 14:49:55] Average Time: 0.155
[INFO] [2019-10-15 14:49:55] Total Time: 8s
[INFO] [2019-10-15 14:49:55] last 3 / first 3: 0.61
[INFO] [2019-10-15 14:49:55] Std.Dev: 0.1414213562373095; Max: 1.03
[STOP] [2019-10-15 14:49:55] parse_diff_and_store
[START] [2019-10-15 14:49:55] resolve_keys
[INFO] [2019-10-15 14:51:37] Occurrences to nodes (through scientific_names)...
[INFO] [2019-10-15 14:51:44] traits to occurrences...
[INFO] [2019-10-15 14:51:50] traits to nodes (through occurrences)...
[INFO] [2019-10-15 14:51:50] Traits to sex term...
[INFO] [2019-10-15 14:51:56] Traits to lifestage term...
[INFO] [2019-10-15 14:52:02] MetaTraits to traits...
[INFO] [2019-10-15 14:52:05] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-10-15 14:52:11] Assocs to occurrences...
[INFO] [2019-10-15 14:52:11] Assocs to nodes...
[INFO] [2019-10-15 14:52:11] Assoc to sex term...
[INFO] [2019-10-15 14:52:11] Assoc to lifestage term...
[STOP] [2019-10-15 14:52:11] resolve_keys
[START] [2019-10-15 14:52:11] hold_for_later_1
[STOP] [2019-10-15 14:52:11] hold_for_later_1
[START] [2019-10-15 14:52:11] hold_for_later_2
[STOP] [2019-10-15 14:52:11] hold_for_later_2
[START] [2019-10-15 14:52:11] resolve_missing_parents
[STOP] [2019-10-15 14:53:05] resolve_missing_parents
[START] [2019-10-15 14:53:05] rebuild_nodes
[START] [2019-10-15 14:53:05] Flattener#flatten
[START] [2019-10-15 14:53:05] Flattener#study_resource
[START] [2019-10-15 14:53:05] Flattener#build_ancestry
[STOP] [2019-10-15 14:53:10] Flattener#build_ancestry
[INFO] [2019-10-15 14:53:10] 34942 ancestry keys
[START] [2019-10-15 14:53:10] build_node_ancestors
[INFO] [2019-10-15 14:53:10] old ancestors deleted.
[STOP] [2019-10-15 14:53:34] build_node_ancestors
[START] [2019-10-15 14:53:40] Flattener#propagate_ancestor_ids
[STOP] [2019-10-15 14:53:45] Flattener#propagate_ancestor_ids
[STOP] [2019-10-15 14:53:45] Flattener#flatten
[STOP] [2019-10-15 14:53:45] rebuild_nodes
[START] [2019-10-15 14:53:45] resolve_missing_media_owners
[STOP] [2019-10-15 14:53:45] resolve_missing_media_owners
[START] [2019-10-15 14:53:45] sanitize_media_verbatims
[STOP] [2019-10-15 14:53:45] sanitize_media_verbatims
[START] [2019-10-15 14:53:45] queue_downloads
[STOP] [2019-10-15 14:53:45] queue_downloads
[START] [2019-10-15 14:53:45] parse_names
[WARN] [2019-10-15 14:53:45] I see 34942 names which still need to be parsed.
[STOP] [2019-10-15 14:54:13] parse_names
[START] [2019-10-15 14:54:13] denormalize_canonical_names_to_nodes
[STOP] [2019-10-15 14:54:13] denormalize_canonical_names_to_nodes
[START] [2019-10-15 14:54:13] match_nodes
[START] [2019-10-15 14:54:13] map_all_nodes_to_pages
[STOP] [2019-10-15 15:51:11] map_all_nodes_to_pages
[INFO] [2019-10-15 15:51:11] 3497 Unmatched nodes (of 34942)! That's too many to output. First 10: Myotis aurascens (#51736294); Cricetodon (#51738472); Spermophilinus (#51744212); Spermophilinus bredai (#51744211); Trogontherium (#51726402); Trigontherium (#51738154); Trigontherium minus (#51738153); Homo erectus (#51732232); Canis variabilis (#51744663); Pagophilus groenlandica (#51717949)
[START] [2019-10-15 15:51:11] update_nodes
[STOP] [2019-10-15 15:51:24] update_nodes
[STOP] [2019-10-15 15:51:24] match_nodes
[START] [2019-10-15 15:51:24] reindex_search
[STOP] [2019-10-15 15:53:10] reindex_search
[START] [2019-10-15 15:53:10] normalize_units
[STOP] [2019-10-15 15:53:10] normalize_units
[START] [2019-10-15 15:53:10] calculate_statistics
[STOP] [2019-10-15 15:53:10] calculate_statistics
[START] [2019-10-15 15:53:10] complete_harvest_instance
[START] [2019-10-15 15:53:10] overall_tsv_creation
[INFO] [2019-10-15 15:53:10] Processing group of 34942 in 4 batches of 10000
[INFO] [2019-10-15 15:54:39] 5844 Traits (unfiltered)...
[INFO] [2019-10-15 15:54:53] 5844 Traits (filtered)...
[INFO] [2019-10-15 15:54:53] 0 Associations (filtered)...
[INFO] [2019-10-15 15:55:42] 29219 metadata added.
[INFO] [2019-10-15 15:55:42] 0 metadata added.
[INFO] [2019-10-15 15:57:18] 6650 Traits (unfiltered)...
[INFO] [2019-10-15 15:57:31] 6650 Traits (filtered)...
[INFO] [2019-10-15 15:57:31] 0 Associations (filtered)...
[INFO] [2019-10-15 15:58:26] 33244 metadata added.
[INFO] [2019-10-15 15:58:26] 0 metadata added.
[INFO] [2019-10-15 16:00:04] 6933 Traits (unfiltered)...
[INFO] [2019-10-15 16:00:17] 6933 Traits (filtered)...
[INFO] [2019-10-15 16:00:17] 0 Associations (filtered)...
[INFO] [2019-10-15 16:01:11] 34650 metadata added.
[INFO] [2019-10-15 16:01:11] 0 metadata added.
[INFO] [2019-10-15 16:02:21] 3622 Traits (unfiltered)...
[INFO] [2019-10-15 16:02:35] 3622 Traits (filtered)...
[INFO] [2019-10-15 16:02:35] 0 Associations (filtered)...
[INFO] [2019-10-15 16:03:21] 18096 metadata added.
[INFO] [2019-10-15 16:03:21] 0 metadata added.
[INFO] [2019-10-15 16:03:21] Average Time: 126.32
[INFO] [2019-10-15 16:03:21] Total Time: 10m12s
[STOP] [2019-10-15 16:03:21] overall_tsv_creation
[INFO] [2019-10-15 16:03:21] Done. Check your files:
[INFO] [2019-10-15 16:03:21] (34942 lines) /app/public/data/russia_sp_list/publish_nodes.tsv
[INFO] [2019-10-15 16:03:22] (188882 lines) /app/public/data/russia_sp_list/publish_node_ancestors.tsv
[INFO] [2019-10-15 16:03:22] (34942 lines) /app/public/data/russia_sp_list/publish_scientific_names.tsv
[INFO] [2019-10-15 16:03:22] (23050 lines) /app/public/data/russia_sp_list/publish_traits.tsv
[INFO] [2019-10-15 16:03:23] (115210 lines) /app/public/data/russia_sp_list/publish_metadata.tsv
[STOP] [2019-10-15 16:03:23] complete_harvest_instance
[START] [2019-10-15 16:03:23] completed
[STOP] [2019-10-15 16:03:23] completed
[STOP] [2019-10-15 16:03:23] logged process, took 4623.5
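
The [START]/[STOP] pairs in the log above can be turned into per-step durations, which makes it easier to see where the run spent its time (for instance, map_all_nodes_to_pages runs from 14:54:13 to 15:51:11). The following is a minimal sketch and not part of the harvester itself: the function name `step_durations` and the regular expression are assumptions made for illustration, based only on the line format visible in this log.

```python
import re
from datetime import datetime

# Assumed line shape, taken from the log above:
#   [START] [2019-10-15 14:54:13] map_all_nodes_to_pages
#   [STOP] [2019-10-15 15:51:11] map_all_nodes_to_pages
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)


def step_durations(lines):
    """Pair each STOP with the most recent START of the same name.

    Returns a list of (step_name, seconds) in order of completion.
    Hypothetical helper, written for this log format only.
    """
    open_steps = {}   # step name -> stack of start times (steps can nest)
    durations = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # [INFO], [CMD], [WARN] and other lines are skipped
        kind, stamp, name = match.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            open_steps.setdefault(name, []).append(when)
        else:
            # The final STOP line carries a ", took N" suffix; drop it.
            name = name.split(", took")[0]
            if open_steps.get(name):
                started = open_steps[name].pop()
                durations.append((name, (when - started).total_seconds()))
    return durations


SAMPLE = [
    "[START] [2019-10-15 14:54:13] match_nodes",
    "[START] [2019-10-15 14:54:13] map_all_nodes_to_pages",
    "[STOP] [2019-10-15 15:51:11] map_all_nodes_to_pages",
    "[START] [2019-10-15 15:51:11] update_nodes",
    "[STOP] [2019-10-15 15:51:24] update_nodes",
    "[STOP] [2019-10-15 15:51:24] match_nodes",
]

if __name__ == "__main__":
    # Print the slowest steps first.
    for name, seconds in sorted(step_durations(SAMPLE), key=lambda d: -d[1]):
        print(f"{name}: {seconds:.0f}s")
```

Timestamps in the log have one-second resolution, so durations computed this way are approximate; applied to the outer "logged process" pair (14:46:19 to 16:03:23) the result is 4624 seconds, in line with the logged "took 4623.5".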

Latest Process