Harvest for United Kingdom Species List
Created: 16 Oct 20:30

Stage: completed
Fetched: 16 Oct 20:30
Validated: 16 Oct 20:30
Deltas Created: 16 Oct 20:30
Units Normalized: 16 Oct 23:07
Ancestry Built: 16 Oct 20:43
Nodes Matched: 16 Oct 23:03
Names Parsed: 16 Oct 20:44
New Models Stored: 16 Oct 20:37
Indexed: 16 Oct 23:07
Completed: 16 Oct 23:27
Time to Harvest: 2 hours 57 minutes
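
The summary above gives only the time at which each stage finished. Per-stage durations can be recovered from the [START]/[STOP] pairs in the harvesting log below. The following is a minimal sketch, assuming the line layout seen in that log; the function name `stage_durations` and the `sample` list are illustrative and not part of the harvester.

```python
from datetime import datetime
import re

# Matches lines such as "[START] [2019-10-16 20:30:24] fetch_files".
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)


def stage_durations(lines):
    """Return {stage_name: seconds} for every START/STOP pair found."""
    started = {}
    durations = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip [INFO], [CMD], [WARN], and header lines
        kind, stamp, name = m.groups()
        # The final STOP line carries a ", took <seconds>" suffix; drop it.
        name = name.split(", took")[0]
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            started[name] = when
        elif name in started:
            durations[name] = (when - started.pop(name)).total_seconds()
    return durations


# Lines copied from the match_nodes stage of the log.
sample = [
    "[START] [2019-10-16 20:44:45] match_nodes",
    "[START] [2019-10-16 20:44:45] map_all_nodes_to_pages",
    "[STOP] [2019-10-16 23:03:08] map_all_nodes_to_pages",
    "[START] [2019-10-16 23:03:08] update_nodes",
    "[STOP] [2019-10-16 23:03:32] update_nodes",
    "[STOP] [2019-10-16 23:03:32] match_nodes",
]

# Print stages from slowest to fastest.
for name, secs in sorted(stage_durations(sample).items(), key=lambda kv: -kv[1]):
    print(f"{name}: {secs:.0f}s")
```

Run over the full log, this kind of breakdown shows that map_all_nodes_to_pages (20:44:45 to 23:03:08) accounts for most of the elapsed time.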


Harvesting Log (oldest first)

# Logfile created on 2019-10-16 20:30:24 -0400 by logger.rb/56815
[START] [2019-10-16 20:30:24] logged process
[START] [2019-10-16 20:30:24] create_harvest_instance
[STOP] [2019-10-16 20:30:24] create_harvest_instance
[START] [2019-10-16 20:30:24] fetch_files
[STOP] [2019-10-16 20:30:24] fetch_files
[START] [2019-10-16 20:30:24] validate_each_file
[STOP] [2019-10-16 20:30:32] validate_each_file
[START] [2019-10-16 20:30:32] convert_to_csv
[CMD] [2019-10-16 20:30:32] /usr/bin/sort /app/public/converted_csv/u_kingdom_sp_lis_refs_17625.csv > /app/public/converted_csv/u_kingdom_sp_lis_refs_17625.csv_sorted
[CMD] [2019-10-16 20:30:33] /usr/bin/sort /app/public/converted_csv/u_kingdom_sp_lis_nodes_17626.csv > /app/public/converted_csv/u_kingdom_sp_lis_nodes_17626.csv_sorted
[CMD] [2019-10-16 20:30:33] /usr/bin/sort /app/public/converted_csv/u_kingdom_sp_lis_occurrences_17627.csv > /app/public/converted_csv/u_kingdom_sp_lis_occurrences_17627.csv_sorted
[CMD] [2019-10-16 20:30:33] /usr/bin/sort /app/public/converted_csv/u_kingdom_sp_lis_measurements_17628.csv > /app/public/converted_csv/u_kingdom_sp_lis_measurements_17628.csv_sorted
[STOP] [2019-10-16 20:30:33] convert_to_csv
[START] [2019-10-16 20:30:33] calculate_delta
[CMD] [2019-10-16 20:30:33] echo "0a" > /app/public/diff/u_kingdom_sp_lis_refs_17625.diff
[CMD] [2019-10-16 20:30:33] tail -n +1 /app/public/converted_csv/u_kingdom_sp_lis_refs_17625.csv >> /app/public/diff/u_kingdom_sp_lis_refs_17625.diff
[CMD] [2019-10-16 20:30:33] echo "." >> /app/public/diff/u_kingdom_sp_lis_refs_17625.diff
[CMD] [2019-10-16 20:30:33] echo "0a" > /app/public/diff/u_kingdom_sp_lis_nodes_17626.diff
[CMD] [2019-10-16 20:30:33] tail -n +1 /app/public/converted_csv/u_kingdom_sp_lis_nodes_17626.csv >> /app/public/diff/u_kingdom_sp_lis_nodes_17626.diff
[CMD] [2019-10-16 20:30:33] echo "." >> /app/public/diff/u_kingdom_sp_lis_nodes_17626.diff
[CMD] [2019-10-16 20:30:33] echo "0a" > /app/public/diff/u_kingdom_sp_lis_occurrences_17627.diff
[CMD] [2019-10-16 20:30:33] tail -n +1 /app/public/converted_csv/u_kingdom_sp_lis_occurrences_17627.csv >> /app/public/diff/u_kingdom_sp_lis_occurrences_17627.diff
[CMD] [2019-10-16 20:30:33] echo "." >> /app/public/diff/u_kingdom_sp_lis_occurrences_17627.diff
[CMD] [2019-10-16 20:30:33] echo "0a" > /app/public/diff/u_kingdom_sp_lis_measurements_17628.diff
[CMD] [2019-10-16 20:30:33] tail -n +1 /app/public/converted_csv/u_kingdom_sp_lis_measurements_17628.csv >> /app/public/diff/u_kingdom_sp_lis_measurements_17628.diff
[CMD] [2019-10-16 20:30:33] echo "." >> /app/public/diff/u_kingdom_sp_lis_measurements_17628.diff
[STOP] [2019-10-16 20:30:33] calculate_delta
[START] [2019-10-16 20:30:33] parse_diff_and_store
[INFO] [2019-10-16 20:30:33] Loading refs diff file into memory (true lines)...
[INFO] [2019-10-16 20:30:34] Loading nodes diff file into memory (true lines)...
[INFO] [2019-10-16 20:30:59] Loading occurrences diff file into memory (true lines)...
[INFO] [2019-10-16 20:31:05] Loading measurements diff file into memory (true lines)...
[INFO] [2019-10-16 20:35:45] Storing 2 References
[INFO] [2019-10-16 20:35:45] Processing group of 2 in 1 groups of 1000
[INFO] [2019-10-16 20:35:45] Average Time: 0.0
[INFO] [2019-10-16 20:35:45] Total Time: 1s
[INFO] [2019-10-16 20:35:45] Storing 70665 ScientificNames
[INFO] [2019-10-16 20:35:45] Processing group of 70665 in 71 groups of 1000
[INFO] [2019-10-16 20:36:17] Average Time: 0.446
[INFO] [2019-10-16 20:36:17] Total Time: 33s
[INFO] [2019-10-16 20:36:17] last 3 / first 3: 0.45
[INFO] [2019-10-16 20:36:17] Std.Dev: 0.34351128074635334; Max: 2.64
[INFO] [2019-10-16 20:36:17] Storing 70665 Nodes
[INFO] [2019-10-16 20:36:17] Processing group of 70665 in 71 groups of 1000
[INFO] [2019-10-16 20:36:44] Average Time: 0.374
[INFO] [2019-10-16 20:36:44] Total Time: 27s
[INFO] [2019-10-16 20:36:44] last 3 / first 3: 1.12
[INFO] [2019-10-16 20:36:44] Std.Dev: 0.3646916505762094; Max: 2.93
[INFO] [2019-10-16 20:36:44] Storing 49621 Occurrences
[INFO] [2019-10-16 20:36:44] Processing group of 49621 in 50 groups of 1000
[INFO] [2019-10-16 20:36:50] Average Time: 0.11
[INFO] [2019-10-16 20:36:50] Total Time: 6s
[INFO] [2019-10-16 20:36:50] last 3 / first 3: 1.24
[INFO] [2019-10-16 20:36:50] Std.Dev: 0.0; Max: 0.22
[INFO] [2019-10-16 20:36:50] Storing 99242 TraitsReferences
[INFO] [2019-10-16 20:36:50] Processing group of 99242 in 100 groups of 1000
[INFO] [2019-10-16 20:37:03] Average Time: 0.123
[INFO] [2019-10-16 20:37:03] Total Time: 13s
[INFO] [2019-10-16 20:37:03] last 3 / first 3: 0.52
[INFO] [2019-10-16 20:37:03] Std.Dev: 0.3065941943351178; Max: 2.92
[INFO] [2019-10-16 20:37:03] Storing 99242 Traits
[INFO] [2019-10-16 20:37:03] Processing group of 99242 in 100 groups of 1000
[INFO] [2019-10-16 20:37:38] Average Time: 0.349
[INFO] [2019-10-16 20:37:38] Total Time: 36s
[INFO] [2019-10-16 20:37:38] last 3 / first 3: 0.72
[INFO] [2019-10-16 20:37:38] Std.Dev: 0.4171330722922842; Max: 3.43
[INFO] [2019-10-16 20:37:38] Storing 99098 MetaTraits
[INFO] [2019-10-16 20:37:38] Processing group of 99098 in 100 groups of 1000
[INFO] [2019-10-16 20:37:56] Average Time: 0.17
[INFO] [2019-10-16 20:37:56] Total Time: 18s
[INFO] [2019-10-16 20:37:56] last 3 / first 3: 0.58
[INFO] [2019-10-16 20:37:56] Std.Dev: 0.4289522117905443; Max: 3.43
[STOP] [2019-10-16 20:37:56] parse_diff_and_store
[START] [2019-10-16 20:37:56] resolve_keys
[INFO] [2019-10-16 20:40:15] Occurrences to nodes (through scientific_names)...
[INFO] [2019-10-16 20:40:25] traits to occurrences...
[INFO] [2019-10-16 20:40:33] traits to nodes (through occurrences)...
[INFO] [2019-10-16 20:40:34] Traits to sex term...
[INFO] [2019-10-16 20:40:41] Traits to lifestage term...
[INFO] [2019-10-16 20:40:49] MetaTraits to traits...
[INFO] [2019-10-16 20:40:55] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-10-16 20:41:09] Assocs to occurrences...
[INFO] [2019-10-16 20:41:09] Assocs to nodes...
[INFO] [2019-10-16 20:41:09] Assoc to sex term...
[INFO] [2019-10-16 20:41:09] Assoc to lifestage term...
[STOP] [2019-10-16 20:41:09] resolve_keys
[START] [2019-10-16 20:41:09] hold_for_later_1
[STOP] [2019-10-16 20:41:09] hold_for_later_1
[START] [2019-10-16 20:41:09] hold_for_later_2
[STOP] [2019-10-16 20:41:09] hold_for_later_2
[START] [2019-10-16 20:41:09] resolve_missing_parents
[STOP] [2019-10-16 20:42:33] resolve_missing_parents
[START] [2019-10-16 20:42:33] rebuild_nodes
[START] [2019-10-16 20:42:33] Flattener#flatten
[START] [2019-10-16 20:42:33] Flattener#study_resource
[START] [2019-10-16 20:42:33] Flattener#build_ancestry
[STOP] [2019-10-16 20:42:45] Flattener#build_ancestry
[INFO] [2019-10-16 20:42:45] 70665 ancestry keys
[START] [2019-10-16 20:42:45] build_node_ancestors
[INFO] [2019-10-16 20:42:45] old ancestors deleted.
[STOP] [2019-10-16 20:43:34] build_node_ancestors
[START] [2019-10-16 20:43:40] Flattener#propagate_ancestor_ids
[STOP] [2019-10-16 20:43:51] Flattener#propagate_ancestor_ids
[STOP] [2019-10-16 20:43:51] Flattener#flatten
[STOP] [2019-10-16 20:43:51] rebuild_nodes
[START] [2019-10-16 20:43:51] resolve_missing_media_owners
[STOP] [2019-10-16 20:43:51] resolve_missing_media_owners
[START] [2019-10-16 20:43:51] sanitize_media_verbatims
[STOP] [2019-10-16 20:43:51] sanitize_media_verbatims
[START] [2019-10-16 20:43:51] queue_downloads
[STOP] [2019-10-16 20:43:51] queue_downloads
[START] [2019-10-16 20:43:51] parse_names
[WARN] [2019-10-16 20:43:51] I see 70665 names which still need to be parsed.
[STOP] [2019-10-16 20:44:44] parse_names
[START] [2019-10-16 20:44:44] denormalize_canonical_names_to_nodes
[STOP] [2019-10-16 20:44:45] denormalize_canonical_names_to_nodes
[START] [2019-10-16 20:44:45] match_nodes
[START] [2019-10-16 20:44:45] map_all_nodes_to_pages
[STOP] [2019-10-16 23:03:08] map_all_nodes_to_pages
[INFO] [2019-10-16 23:03:08] 7834 Unmatched nodes (of 70665)! That's too many to output. First 10: Magnoliophyta (#52561646); Magnoliopsida (#52561645); Betula excelsa (#52624206); Alnus viridis (#52594205); Quercus borealis (#52614230); Quercus bambusifolia (#52623130); Morella bojeriana (#52610512); Juglandicarya lubbockii (#52619676); Parietaria diffusa (#52622953); Rubus pyramidalis (#52572722)
[START] [2019-10-16 23:03:08] update_nodes
[STOP] [2019-10-16 23:03:32] update_nodes
[STOP] [2019-10-16 23:03:32] match_nodes
[START] [2019-10-16 23:03:32] reindex_search
[STOP] [2019-10-16 23:07:17] reindex_search
[START] [2019-10-16 23:07:17] normalize_units
[STOP] [2019-10-16 23:07:47] normalize_units
[START] [2019-10-16 23:07:47] calculate_statistics
[STOP] [2019-10-16 23:07:47] calculate_statistics
[START] [2019-10-16 23:07:47] complete_harvest_instance
[START] [2019-10-16 23:07:48] overall_tsv_creation
[INFO] [2019-10-16 23:07:48] Processing group of 70665 in 8 batches of 10000
[INFO] [2019-10-16 23:09:14] 5729 Traits (unfiltered)...
[INFO] [2019-10-16 23:09:27] 5729 Traits (filtered)...
[INFO] [2019-10-16 23:09:27] 0 Associations (filtered)...
[INFO] [2019-10-16 23:10:15] 28638 metadata added.
[INFO] [2019-10-16 23:10:15] 0 metadata added.
[INFO] [2019-10-16 23:11:45] 6715 Traits (unfiltered)...
[INFO] [2019-10-16 23:11:58] 6715 Traits (filtered)...
[INFO] [2019-10-16 23:11:58] 0 Associations (filtered)...
[INFO] [2019-10-16 23:12:48] 33570 metadata added.
[INFO] [2019-10-16 23:12:48] 0 metadata added.
[INFO] [2019-10-16 23:14:21] 7198 Traits (unfiltered)...
[INFO] [2019-10-16 23:14:34] 7198 Traits (filtered)...
[INFO] [2019-10-16 23:14:35] 0 Associations (filtered)...
[INFO] [2019-10-16 23:15:27] 35984 metadata added.
[INFO] [2019-10-16 23:15:27] 0 metadata added.
[INFO] [2019-10-16 23:17:02] 7425 Traits (unfiltered)...
[INFO] [2019-10-16 23:17:15] 7425 Traits (filtered)...
[INFO] [2019-10-16 23:17:15] 0 Associations (filtered)...
[INFO] [2019-10-16 23:18:09] 37110 metadata added.
[INFO] [2019-10-16 23:18:09] 0 metadata added.
[INFO] [2019-10-16 23:19:41] 7418 Traits (unfiltered)...
[INFO] [2019-10-16 23:19:54] 7418 Traits (filtered)...
[INFO] [2019-10-16 23:19:54] 0 Associations (filtered)...
[INFO] [2019-10-16 23:20:51] 37058 metadata added.
[INFO] [2019-10-16 23:20:51] 0 metadata added.
[INFO] [2019-10-16 23:22:24] 7191 Traits (unfiltered)...
[INFO] [2019-10-16 23:22:37] 7191 Traits (filtered)...
[INFO] [2019-10-16 23:22:37] 0 Associations (filtered)...
[INFO] [2019-10-16 23:23:30] 35924 metadata added.
[INFO] [2019-10-16 23:23:30] 0 metadata added.
[INFO] [2019-10-16 23:25:03] 7439 Traits (unfiltered)...
[INFO] [2019-10-16 23:25:16] 7439 Traits (filtered)...
[INFO] [2019-10-16 23:25:16] 0 Associations (filtered)...
[INFO] [2019-10-16 23:26:13] 37154 metadata added.
[INFO] [2019-10-16 23:26:13] 0 metadata added.
[INFO] [2019-10-16 23:26:58] 506 Traits (unfiltered)...
[INFO] [2019-10-16 23:27:11] 506 Traits (filtered)...
[INFO] [2019-10-16 23:27:11] 0 Associations (filtered)...
[INFO] [2019-10-16 23:27:49] 2523 metadata added.
[INFO] [2019-10-16 23:27:49] 0 metadata added.
[INFO] [2019-10-16 23:27:49] Average Time: 122.633
[INFO] [2019-10-16 23:27:49] Total Time: 20m2s
[INFO] [2019-10-16 23:27:49] last 3 / first 3: 0.9
[INFO] [2019-10-16 23:27:49] Std.Dev: 19.53967758178215; Max: 133.39
[STOP] [2019-10-16 23:27:49] overall_tsv_creation
[INFO] [2019-10-16 23:27:49] Done. Check your files:
[INFO] [2019-10-16 23:27:49] (70665 lines) /app/public/data/u_kingdom_sp_lis/publish_nodes.tsv
[INFO] [2019-10-16 23:27:49] (386432 lines) /app/public/data/u_kingdom_sp_lis/publish_node_ancestors.tsv
[INFO] [2019-10-16 23:27:49] (70665 lines) /app/public/data/u_kingdom_sp_lis/publish_scientific_names.tsv
[INFO] [2019-10-16 23:27:49] (49622 lines) /app/public/data/u_kingdom_sp_lis/publish_traits.tsv
[INFO] [2019-10-16 23:27:49] (247962 lines) /app/public/data/u_kingdom_sp_lis/publish_metadata.tsv
[STOP] [2019-10-16 23:27:50] complete_harvest_instance
[START] [2019-10-16 23:27:50] completed
[STOP] [2019-10-16 23:27:50] completed
[STOP] [2019-10-16 23:27:50] logged process, took 10645.85
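
The calculate_delta commands in the log above assemble each .diff file from three pieces: the line `0a`, the full CSV contents, and a closing `.`. That is the shape of an ed-style script, where `0a` means "append the following lines after line 0" and a lone `.` ends the input. In other words, every row is treated as new, presumably because there was no earlier harvest to compare against. The sketch below illustrates the format only; the function names and the sample rows are hypothetical and are not taken from the harvester.

```python
def build_full_insert_script(csv_lines):
    """Build an ed-style script that appends every line after line 0."""
    return ["0a", *csv_lines, "."]


def apply_append_script(existing, script):
    """Apply a script that holds exactly one 'Na ... .' append command."""
    header, *body = script
    if not header.endswith("a") or not body or body[-1] != ".":
        raise ValueError("only a single append command is supported")
    at = int(header[:-1])  # line to append after; 0 means the top of the file
    return existing[:at] + body[:-1] + existing[at:]


# Hypothetical rows standing in for a converted CSV file.
rows = ["id,name", "1,first row", "2,second row"]
script = build_full_insert_script(rows)
print("\n".join(script))

# Applying the script to an empty file reproduces the rows unchanged.
assert apply_append_script([], script) == rows
```

One limitation of the format itself: a data line consisting of a single `.` would end the append block early, so this representation is only safe when no row can take that form.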
