Harvest for Greenland Species List
Created: 13 Oct 05:41

Stage: completed
Fetched: 13 Oct 05:41
Validated: 13 Oct 05:41
Deltas Created: 13 Oct 05:41
Units Normalized: 13 Oct 05:52
Ancestry Built: 13 Oct 05:42
Nodes Matched: 13 Oct 05:52
Names Parsed: 13 Oct 05:42
New Models Stored: 13 Oct 05:41
Indexed: 13 Oct 05:52
Completed: 13 Oct 05:54
Time to Harvest: about 13 minutes (797.3 seconds per the log)

Harvesting Log

(145 lines)
# Logfile created on 2019-10-13 05:41:13 -0400 by logger.rb/56815
[START] [2019-10-13 05:41:13] logged process
[START] [2019-10-13 05:41:13] create_harvest_instance
[STOP] [2019-10-13 05:41:14] create_harvest_instance
[START] [2019-10-13 05:41:14] fetch_files
[STOP] [2019-10-13 05:41:14] fetch_files
[START] [2019-10-13 05:41:14] validate_each_file
[STOP] [2019-10-13 05:41:15] validate_each_file
[START] [2019-10-13 05:41:15] convert_to_csv
[CMD] [2019-10-13 05:41:15] /usr/bin/sort /app/public/converted_csv/greenland_sp_lis_refs_15883.csv > /app/public/converted_csv/greenland_sp_lis_refs_15883.csv_sorted
[CMD] [2019-10-13 05:41:15] /usr/bin/sort /app/public/converted_csv/greenland_sp_lis_nodes_15884.csv > /app/public/converted_csv/greenland_sp_lis_nodes_15884.csv_sorted
[CMD] [2019-10-13 05:41:15] /usr/bin/sort /app/public/converted_csv/greenland_sp_lis_occurrences_15885.csv > /app/public/converted_csv/greenland_sp_lis_occurrences_15885.csv_sorted
[CMD] [2019-10-13 05:41:15] /usr/bin/sort /app/public/converted_csv/greenland_sp_lis_measurements_15886.csv > /app/public/converted_csv/greenland_sp_lis_measurements_15886.csv_sorted
[STOP] [2019-10-13 05:41:15] convert_to_csv
[START] [2019-10-13 05:41:15] calculate_delta
[CMD] [2019-10-13 05:41:15] echo "0a" > /app/public/diff/greenland_sp_lis_refs_15883.diff
[CMD] [2019-10-13 05:41:15] tail -n +1 /app/public/converted_csv/greenland_sp_lis_refs_15883.csv >> /app/public/diff/greenland_sp_lis_refs_15883.diff
[CMD] [2019-10-13 05:41:15] echo "." >> /app/public/diff/greenland_sp_lis_refs_15883.diff
[CMD] [2019-10-13 05:41:15] echo "0a" > /app/public/diff/greenland_sp_lis_nodes_15884.diff
[CMD] [2019-10-13 05:41:16] tail -n +1 /app/public/converted_csv/greenland_sp_lis_nodes_15884.csv >> /app/public/diff/greenland_sp_lis_nodes_15884.diff
[CMD] [2019-10-13 05:41:16] echo "." >> /app/public/diff/greenland_sp_lis_nodes_15884.diff
[CMD] [2019-10-13 05:41:16] echo "0a" > /app/public/diff/greenland_sp_lis_occurrences_15885.diff
[CMD] [2019-10-13 05:41:16] tail -n +1 /app/public/converted_csv/greenland_sp_lis_occurrences_15885.csv >> /app/public/diff/greenland_sp_lis_occurrences_15885.diff
[CMD] [2019-10-13 05:41:16] echo "." >> /app/public/diff/greenland_sp_lis_occurrences_15885.diff
[CMD] [2019-10-13 05:41:16] echo "0a" > /app/public/diff/greenland_sp_lis_measurements_15886.diff
[CMD] [2019-10-13 05:41:16] tail -n +1 /app/public/converted_csv/greenland_sp_lis_measurements_15886.csv >> /app/public/diff/greenland_sp_lis_measurements_15886.diff
[CMD] [2019-10-13 05:41:16] echo "." >> /app/public/diff/greenland_sp_lis_measurements_15886.diff
[STOP] [2019-10-13 05:41:16] calculate_delta
[START] [2019-10-13 05:41:16] parse_diff_and_store
[INFO] [2019-10-13 05:41:16] Loading refs diff file into memory (true lines)...
[INFO] [2019-10-13 05:41:16] Loading nodes diff file into memory (true lines)...
[INFO] [2019-10-13 05:41:19] Loading occurrences diff file into memory (true lines)...
[INFO] [2019-10-13 05:41:19] Loading measurements diff file into memory (true lines)...
[INFO] [2019-10-13 05:41:38] Storing 2 References
[INFO] [2019-10-13 05:41:38] Processing group of 2 in 1 groups of 1000
[INFO] [2019-10-13 05:41:38] Average Time: 0.0
[INFO] [2019-10-13 05:41:38] Total Time: 1s
[INFO] [2019-10-13 05:41:38] Storing 5885 ScientificNames
[INFO] [2019-10-13 05:41:38] Processing group of 5885 in 6 groups of 1000
[INFO] [2019-10-13 05:41:40] Average Time: 0.368
[INFO] [2019-10-13 05:41:40] Total Time: 3s
[INFO] [2019-10-13 05:41:40] Storing 5885 Nodes
[INFO] [2019-10-13 05:41:40] Processing group of 5885 in 6 groups of 1000
[INFO] [2019-10-13 05:41:42] Average Time: 0.292
[INFO] [2019-10-13 05:41:42] Total Time: 2s
[INFO] [2019-10-13 05:41:42] Storing 3086 Occurrences
[INFO] [2019-10-13 05:41:42] Processing group of 3086 in 4 groups of 1000
[INFO] [2019-10-13 05:41:42] Average Time: 0.09
[INFO] [2019-10-13 05:41:42] Total Time: 1s
[INFO] [2019-10-13 05:41:42] Storing 6172 TraitsReferences
[INFO] [2019-10-13 05:41:42] Processing group of 6172 in 7 groups of 1000
[INFO] [2019-10-13 05:41:43] Average Time: 0.073
[INFO] [2019-10-13 05:41:43] Total Time: 1s
[INFO] [2019-10-13 05:41:43] last 3 / first 3: 0.47
[INFO] [2019-10-13 05:41:43] Std.Dev: 0.044721359549995794; Max: 0.16
[INFO] [2019-10-13 05:41:43] Storing 6172 Traits
[INFO] [2019-10-13 05:41:43] Processing group of 6172 in 7 groups of 1000
[INFO] [2019-10-13 05:41:45] Average Time: 0.28
[INFO] [2019-10-13 05:41:45] Total Time: 3s
[INFO] [2019-10-13 05:41:45] last 3 / first 3: 0.64
[INFO] [2019-10-13 05:41:45] Std.Dev: 0.10954451150103323; Max: 0.4
[INFO] [2019-10-13 05:41:45] Storing 6166 MetaTraits
[INFO] [2019-10-13 05:41:45] Processing group of 6166 in 7 groups of 1000
[INFO] [2019-10-13 05:41:46] Average Time: 0.109
[INFO] [2019-10-13 05:41:46] Total Time: 1s
[INFO] [2019-10-13 05:41:46] last 3 / first 3: 0.76
[INFO] [2019-10-13 05:41:46] Std.Dev: 0.044721359549995794; Max: 0.15
[STOP] [2019-10-13 05:41:46] parse_diff_and_store
[START] [2019-10-13 05:41:46] resolve_keys
[INFO] [2019-10-13 05:42:09] Occurrences to nodes (through scientific_names)...
[INFO] [2019-10-13 05:42:12] traits to occurrences...
[INFO] [2019-10-13 05:42:13] traits to nodes (through occurrences)...
[INFO] [2019-10-13 05:42:13] Traits to sex term...
[INFO] [2019-10-13 05:42:14] Traits to lifestage term...
[INFO] [2019-10-13 05:42:16] MetaTraits to traits...
[INFO] [2019-10-13 05:42:16] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-10-13 05:42:17] Assocs to occurrences...
[INFO] [2019-10-13 05:42:17] Assocs to nodes...
[INFO] [2019-10-13 05:42:17] Assoc to sex term...
[INFO] [2019-10-13 05:42:17] Assoc to lifestage term...
[STOP] [2019-10-13 05:42:17] resolve_keys
[START] [2019-10-13 05:42:17] hold_for_later_1
[STOP] [2019-10-13 05:42:17] hold_for_later_1
[START] [2019-10-13 05:42:17] hold_for_later_2
[STOP] [2019-10-13 05:42:17] hold_for_later_2
[START] [2019-10-13 05:42:17] resolve_missing_parents
[STOP] [2019-10-13 05:42:28] resolve_missing_parents
[START] [2019-10-13 05:42:28] rebuild_nodes
[START] [2019-10-13 05:42:28] Flattener#flatten
[START] [2019-10-13 05:42:28] Flattener#study_resource
[START] [2019-10-13 05:42:28] Flattener#build_ancestry
[STOP] [2019-10-13 05:42:28] Flattener#build_ancestry
[INFO] [2019-10-13 05:42:28] 5885 ancestry keys
[START] [2019-10-13 05:42:28] build_node_ancestors
[INFO] [2019-10-13 05:42:28] old ancestors deleted.
[STOP] [2019-10-13 05:42:31] build_node_ancestors
[START] [2019-10-13 05:42:33] Flattener#propagate_ancestor_ids
[STOP] [2019-10-13 05:42:34] Flattener#propagate_ancestor_ids
[STOP] [2019-10-13 05:42:34] Flattener#flatten
[STOP] [2019-10-13 05:42:34] rebuild_nodes
[START] [2019-10-13 05:42:34] resolve_missing_media_owners
[STOP] [2019-10-13 05:42:34] resolve_missing_media_owners
[START] [2019-10-13 05:42:34] sanitize_media_verbatims
[STOP] [2019-10-13 05:42:34] sanitize_media_verbatims
[START] [2019-10-13 05:42:34] queue_downloads
[STOP] [2019-10-13 05:42:34] queue_downloads
[START] [2019-10-13 05:42:34] parse_names
[WARN] [2019-10-13 05:42:34] I see 5885 names which still need to be parsed.
[STOP] [2019-10-13 05:42:39] parse_names
[START] [2019-10-13 05:42:39] denormalize_canonical_names_to_nodes
[STOP] [2019-10-13 05:42:39] denormalize_canonical_names_to_nodes
[START] [2019-10-13 05:42:39] match_nodes
[START] [2019-10-13 05:42:39] map_all_nodes_to_pages
[STOP] [2019-10-13 05:52:02] map_all_nodes_to_pages
[INFO] [2019-10-13 05:52:02] 644 Unmatched nodes (of 5885)! That's too many to output. First 10: Larus thayeri (#49936144); Carduelis flammea (#49932989); Carduelis hornemanni (#49935204); Chen (#49936329); Chen caerulescens (#49936328); Lagopus mutus (#49934064); Morus bassana (#49937693); Bubo scandiaca (#49936092); Pagophilus groenlandica (#49934119); Gasterosteiformes (#49932276)
[START] [2019-10-13 05:52:02] update_nodes
[STOP] [2019-10-13 05:52:04] update_nodes
[STOP] [2019-10-13 05:52:04] match_nodes
[START] [2019-10-13 05:52:04] reindex_search
[STOP] [2019-10-13 05:52:22] reindex_search
[START] [2019-10-13 05:52:22] normalize_units
[STOP] [2019-10-13 05:52:22] normalize_units
[START] [2019-10-13 05:52:22] calculate_statistics
[STOP] [2019-10-13 05:52:22] calculate_statistics
[START] [2019-10-13 05:52:22] complete_harvest_instance
[START] [2019-10-13 05:52:22] overall_tsv_creation
[INFO] [2019-10-13 05:52:22] Processing group of 5885 in 1 batches of 10000
[INFO] [2019-10-13 05:53:33] 3086 Traits (unfiltered)...
[INFO] [2019-10-13 05:53:47] 3086 Traits (filtered)...
[INFO] [2019-10-13 05:53:47] 0 Associations (filtered)...
[INFO] [2019-10-13 05:54:30] 15424 metadata added.
[INFO] [2019-10-13 05:54:30] 0 metadata added.
[INFO] [2019-10-13 05:54:30] Average Time: 103.35
[INFO] [2019-10-13 05:54:30] Total Time: 2m8s
[STOP] [2019-10-13 05:54:30] overall_tsv_creation
[INFO] [2019-10-13 05:54:30] Done. Check your files:
[INFO] [2019-10-13 05:54:30] (5885 lines) /app/public/data/greenland_sp_lis/publish_nodes.tsv
[INFO] [2019-10-13 05:54:30] (29649 lines) /app/public/data/greenland_sp_lis/publish_node_ancestors.tsv
[INFO] [2019-10-13 05:54:30] (5885 lines) /app/public/data/greenland_sp_lis/publish_scientific_names.tsv
[INFO] [2019-10-13 05:54:30] (3087 lines) /app/public/data/greenland_sp_lis/publish_traits.tsv
[INFO] [2019-10-13 05:54:30] (15425 lines) /app/public/data/greenland_sp_lis/publish_metadata.tsv
[STOP] [2019-10-13 05:54:31] complete_harvest_instance
[START] [2019-10-13 05:54:31] completed
[STOP] [2019-10-13 05:54:31] completed
[STOP] [2019-10-13 05:54:31] logged process, took 797.3
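
The log above brackets each step with a [START] and a [STOP] line, so per-step durations can be recovered by pairing them. The following is a minimal sketch in Python, not part of the harvester itself: the function name step_durations and the regular expression are assumptions made for illustration, and the sample lines are copied from the log. It assumes each step name is open at most once at a time and that any text after a comma on a [STOP] line (such as ", took 797.3") is a suffix rather than part of the name.

```python
import re
from datetime import datetime

# Matches lines such as "[START] [2019-10-13 05:42:39] match_nodes".
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)


def step_durations(lines):
    """Return (step_name, seconds) pairs in the order the steps stop."""
    open_steps = {}  # step name -> start time
    durations = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # [INFO], [WARN], [CMD] and comment lines are skipped
        kind, stamp, name = match.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            open_steps[name] = when
        else:
            # Drop a trailing suffix such as ", took 797.3" (assumption).
            name = name.split(",")[0]
            started = open_steps.pop(name, None)
            if started is not None:
                durations.append((name, (when - started).total_seconds()))
    return durations


sample = [
    "[START] [2019-10-13 05:42:34] parse_names",
    "[WARN] [2019-10-13 05:42:34] I see 5885 names which still need to be parsed.",
    "[STOP] [2019-10-13 05:42:39] parse_names",
    "[START] [2019-10-13 05:42:39] match_nodes",
    "[START] [2019-10-13 05:42:39] map_all_nodes_to_pages",
    "[STOP] [2019-10-13 05:52:02] map_all_nodes_to_pages",
    "[START] [2019-10-13 05:52:02] update_nodes",
    "[STOP] [2019-10-13 05:52:04] update_nodes",
    "[STOP] [2019-10-13 05:52:04] match_nodes",
]

# Print the slowest steps first.
for name, seconds in sorted(step_durations(sample), key=lambda pair: -pair[1]):
    print(f"{name}: {seconds:.0f}s")
```

On these sample lines, map_all_nodes_to_pages accounts for 563 of the 565 seconds spent in match_nodes, which is where most of the harvest time went. Timestamps in the log have one-second resolution, so short steps will show as 0 or 1 second.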

Latest Process