Stage:
completed
Fetched:
14 Oct 13:29
Validated:
14 Oct 13:29
Deltas Created:
14 Oct 13:29
Units Normalized:
14 Oct 13:46
Ancestry Built:
14 Oct 13:32
Nodes Matched:
14 Oct 13:45
Names Parsed:
14 Oct 13:33
New Models Stored:
14 Oct 13:30
Indexed:
14 Oct 13:46
Completed:
14 Oct 13:51
Time to Harvest:
less than a minute
Harvesting Log
(156 lines)
# Logfile created on 2019-10-14 13:29:05 -0400 by logger.rb/56815
[START] [2019-10-14 13:29:05] logged process
[START] [2019-10-14 13:29:05] create_harvest_instance
[STOP] [2019-10-14 13:29:06] create_harvest_instance
[START] [2019-10-14 13:29:06] fetch_files
[STOP] [2019-10-14 13:29:06] fetch_files
[START] [2019-10-14 13:29:06] validate_each_file
[STOP] [2019-10-14 13:29:08] validate_each_file
[START] [2019-10-14 13:29:08] convert_to_csv
[CMD] [2019-10-14 13:29:08] /usr/bin/sort /app/public/converted_csv/nicaragua_sp_lis_refs_16707.csv > /app/public/converted_csv/nicaragua_sp_lis_refs_16707.csv_sorted
[CMD] [2019-10-14 13:29:08] /usr/bin/sort /app/public/converted_csv/nicaragua_sp_lis_nodes_16708.csv > /app/public/converted_csv/nicaragua_sp_lis_nodes_16708.csv_sorted
[CMD] [2019-10-14 13:29:08] /usr/bin/sort /app/public/converted_csv/nicaragua_sp_lis_occurrences_16709.csv > /app/public/converted_csv/nicaragua_sp_lis_occurrences_16709.csv_sorted
[CMD] [2019-10-14 13:29:08] /usr/bin/sort /app/public/converted_csv/nicaragua_sp_lis_measurements_16710.csv > /app/public/converted_csv/nicaragua_sp_lis_measurements_16710.csv_sorted
[STOP] [2019-10-14 13:29:08] convert_to_csv
[START] [2019-10-14 13:29:08] calculate_delta
[CMD] [2019-10-14 13:29:08] echo "0a" > /app/public/diff/nicaragua_sp_lis_refs_16707.diff
[CMD] [2019-10-14 13:29:09] tail -n +1 /app/public/converted_csv/nicaragua_sp_lis_refs_16707.csv >> /app/public/diff/nicaragua_sp_lis_refs_16707.diff
[CMD] [2019-10-14 13:29:09] echo "." >> /app/public/diff/nicaragua_sp_lis_refs_16707.diff
[CMD] [2019-10-14 13:29:09] echo "0a" > /app/public/diff/nicaragua_sp_lis_nodes_16708.diff
[CMD] [2019-10-14 13:29:09] tail -n +1 /app/public/converted_csv/nicaragua_sp_lis_nodes_16708.csv >> /app/public/diff/nicaragua_sp_lis_nodes_16708.diff
[CMD] [2019-10-14 13:29:09] echo "." >> /app/public/diff/nicaragua_sp_lis_nodes_16708.diff
[CMD] [2019-10-14 13:29:09] echo "0a" > /app/public/diff/nicaragua_sp_lis_occurrences_16709.diff
[CMD] [2019-10-14 13:29:09] tail -n +1 /app/public/converted_csv/nicaragua_sp_lis_occurrences_16709.csv >> /app/public/diff/nicaragua_sp_lis_occurrences_16709.diff
[CMD] [2019-10-14 13:29:09] echo "." >> /app/public/diff/nicaragua_sp_lis_occurrences_16709.diff
[CMD] [2019-10-14 13:29:09] echo "0a" > /app/public/diff/nicaragua_sp_lis_measurements_16710.diff
[CMD] [2019-10-14 13:29:09] tail -n +1 /app/public/converted_csv/nicaragua_sp_lis_measurements_16710.csv >> /app/public/diff/nicaragua_sp_lis_measurements_16710.diff
[CMD] [2019-10-14 13:29:10] echo "." >> /app/public/diff/nicaragua_sp_lis_measurements_16710.diff
[STOP] [2019-10-14 13:29:10] calculate_delta
[START] [2019-10-14 13:29:10] parse_diff_and_store
[INFO] [2019-10-14 13:29:10] Loading refs diff file into memory (true lines)...
[INFO] [2019-10-14 13:29:10] Loading nodes diff file into memory (true lines)...
[INFO] [2019-10-14 13:29:16] Loading occurrences diff file into memory (true lines)...
[INFO] [2019-10-14 13:29:18] Loading measurements diff file into memory (true lines)...
[INFO] [2019-10-14 13:30:28] Storing 2 References
[INFO] [2019-10-14 13:30:28] Processing group of 2 in 1 groups of 1000
[INFO] [2019-10-14 13:30:28] Average Time: 0.0
[INFO] [2019-10-14 13:30:28] Total Time: 1s
[INFO] [2019-10-14 13:30:28] Storing 17246 ScientificNames
[INFO] [2019-10-14 13:30:28] Processing group of 17246 in 18 groups of 1000
[INFO] [2019-10-14 13:30:34] Average Time: 0.381
[INFO] [2019-10-14 13:30:34] Total Time: 7s
[INFO] [2019-10-14 13:30:34] last 3 / first 3: 0.7
[INFO] [2019-10-14 13:30:34] Std.Dev: 0.1224744871391589; Max: 0.73
[INFO] [2019-10-14 13:30:34] Storing 17246 Nodes
[INFO] [2019-10-14 13:30:34] Processing group of 17246 in 18 groups of 1000
[INFO] [2019-10-14 13:30:40] Average Time: 0.278
[INFO] [2019-10-14 13:30:40] Total Time: 6s
[INFO] [2019-10-14 13:30:40] last 3 / first 3: 0.78
[INFO] [2019-10-14 13:30:40] Std.Dev: 0.05477225575051661; Max: 0.34
[INFO] [2019-10-14 13:30:40] Storing 11568 Occurrences
[INFO] [2019-10-14 13:30:40] Processing group of 11568 in 12 groups of 1000
[INFO] [2019-10-14 13:30:41] Average Time: 0.101
[INFO] [2019-10-14 13:30:41] Total Time: 2s
[INFO] [2019-10-14 13:30:41] last 3 / first 3: 0.81
[INFO] [2019-10-14 13:30:41] Std.Dev: 0.0; Max: 0.14
[INFO] [2019-10-14 13:30:41] Storing 24080 TraitsReferences
[INFO] [2019-10-14 13:30:41] Processing group of 24080 in 25 groups of 1000
[INFO] [2019-10-14 13:30:43] Average Time: 0.073
[INFO] [2019-10-14 13:30:43] Total Time: 2s
[INFO] [2019-10-14 13:30:43] last 3 / first 3: 0.54
[INFO] [2019-10-14 13:30:43] Std.Dev: 0.03162277660168379; Max: 0.15
[INFO] [2019-10-14 13:30:43] Storing 24079 Traits
[INFO] [2019-10-14 13:30:43] Processing group of 24079 in 25 groups of 1000
[INFO] [2019-10-14 13:30:50] Average Time: 0.282
[INFO] [2019-10-14 13:30:50] Total Time: 8s
[INFO] [2019-10-14 13:30:50] last 3 / first 3: 0.71
[INFO] [2019-10-14 13:30:50] Std.Dev: 0.05477225575051661; Max: 0.34
[INFO] [2019-10-14 13:30:50] Storing 24071 MetaTraits
[INFO] [2019-10-14 13:30:50] Processing group of 24071 in 25 groups of 1000
[INFO] [2019-10-14 13:30:54] Average Time: 0.143
[INFO] [2019-10-14 13:30:54] Total Time: 4s
[INFO] [2019-10-14 13:30:54] last 3 / first 3: 0.74
[INFO] [2019-10-14 13:30:54] Std.Dev: 0.07745966692414834; Max: 0.37
[STOP] [2019-10-14 13:30:54] parse_diff_and_store
[START] [2019-10-14 13:30:54] resolve_keys
[INFO] [2019-10-14 13:31:53] Occurrences to nodes (through scientific_names)...
[INFO] [2019-10-14 13:31:58] traits to occurrences...
[INFO] [2019-10-14 13:32:02] traits to nodes (through occurrences)...
[INFO] [2019-10-14 13:32:02] Traits to sex term...
[INFO] [2019-10-14 13:32:06] Traits to lifestage term...
[INFO] [2019-10-14 13:32:10] MetaTraits to traits...
[INFO] [2019-10-14 13:32:12] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-10-14 13:32:15] Assocs to occurrences...
[INFO] [2019-10-14 13:32:15] Assocs to nodes...
[INFO] [2019-10-14 13:32:15] Assoc to sex term...
[INFO] [2019-10-14 13:32:15] Assoc to lifestage term...
[STOP] [2019-10-14 13:32:15] resolve_keys
[START] [2019-10-14 13:32:15] hold_for_later_1
[STOP] [2019-10-14 13:32:15] hold_for_later_1
[START] [2019-10-14 13:32:15] hold_for_later_2
[STOP] [2019-10-14 13:32:15] hold_for_later_2
[START] [2019-10-14 13:32:15] resolve_missing_parents
[STOP] [2019-10-14 13:32:50] resolve_missing_parents
[START] [2019-10-14 13:32:50] rebuild_nodes
[START] [2019-10-14 13:32:50] Flattener#flatten
[START] [2019-10-14 13:32:50] Flattener#study_resource
[START] [2019-10-14 13:32:50] Flattener#build_ancestry
[STOP] [2019-10-14 13:32:50] Flattener#build_ancestry
[INFO] [2019-10-14 13:32:50] 17246 ancestry keys
[START] [2019-10-14 13:32:50] build_node_ancestors
[INFO] [2019-10-14 13:32:50] old ancestors deleted.
[STOP] [2019-10-14 13:32:54] build_node_ancestors
[START] [2019-10-14 13:32:57] Flattener#propagate_ancestor_ids
[STOP] [2019-10-14 13:32:58] Flattener#propagate_ancestor_ids
[STOP] [2019-10-14 13:32:58] Flattener#flatten
[STOP] [2019-10-14 13:32:58] rebuild_nodes
[START] [2019-10-14 13:32:58] resolve_missing_media_owners
[STOP] [2019-10-14 13:32:58] resolve_missing_media_owners
[START] [2019-10-14 13:32:58] sanitize_media_verbatims
[STOP] [2019-10-14 13:32:58] sanitize_media_verbatims
[START] [2019-10-14 13:32:58] queue_downloads
[STOP] [2019-10-14 13:32:58] queue_downloads
[START] [2019-10-14 13:32:58] parse_names
[WARN] [2019-10-14 13:32:58] I see 17246 names which still need to be parsed.
[STOP] [2019-10-14 13:33:12] parse_names
[START] [2019-10-14 13:33:12] denormalize_canonical_names_to_nodes
[STOP] [2019-10-14 13:33:12] denormalize_canonical_names_to_nodes
[START] [2019-10-14 13:33:12] match_nodes
[START] [2019-10-14 13:33:12] map_all_nodes_to_pages
[STOP] [2019-10-14 13:45:44] map_all_nodes_to_pages
[INFO] [2019-10-14 13:45:44] 969 Unmatched nodes (of 17246)! That's too many to output. First 10: Balclutha lucida (#50924051); Balclutha rosea (#50924109); Balclutha saltuella (#50924449); Balclutha hebe (#50936827); Proba atratus (#50934731); Paracarniella azteca (#50936903); Hypselonotus proximus (#50940729); Leptoglossus (#50932373); Sirthenea stria (#50933882); Triatoma dimidiata (#50937485)
[START] [2019-10-14 13:45:44] update_nodes
[STOP] [2019-10-14 13:45:50] update_nodes
[STOP] [2019-10-14 13:45:50] match_nodes
[START] [2019-10-14 13:45:50] reindex_search
[STOP] [2019-10-14 13:46:26] reindex_search
[START] [2019-10-14 13:46:26] normalize_units
[STOP] [2019-10-14 13:46:26] normalize_units
[START] [2019-10-14 13:46:26] calculate_statistics
[STOP] [2019-10-14 13:46:26] calculate_statistics
[START] [2019-10-14 13:46:26] complete_harvest_instance
[START] [2019-10-14 13:46:26] overall_tsv_creation
[INFO] [2019-10-14 13:46:26] Processing group of 17246 in 2 batches of 10000
[INFO] [2019-10-14 13:47:54] 6686 Traits (unfiltered)...
[INFO] [2019-10-14 13:48:08] 6686 Traits (filtered)...
[INFO] [2019-10-14 13:48:08] 0 Associations (filtered)...
[INFO] [2019-10-14 13:48:59] 33429 metadata added.
[INFO] [2019-10-14 13:48:59] 0 metadata added.
[INFO] [2019-10-14 13:50:17] 4882 Traits (unfiltered)...
[INFO] [2019-10-14 13:50:31] 4882 Traits (filtered)...
[INFO] [2019-10-14 13:50:31] 0 Associations (filtered)...
[INFO] [2019-10-14 13:51:19] 24404 metadata added.
[INFO] [2019-10-14 13:51:19] 0 metadata added.
[INFO] [2019-10-14 13:51:19] Average Time: 120.645
[INFO] [2019-10-14 13:51:19] Total Time: 4m53s
[STOP] [2019-10-14 13:51:19] overall_tsv_creation
[INFO] [2019-10-14 13:51:19] Done. Check your files:
[INFO] [2019-10-14 13:51:19] (17246 lines) /app/public/data/nicaragua_sp_lis/publish_nodes.tsv
[INFO] [2019-10-14 13:51:19] (44195 lines) /app/public/data/nicaragua_sp_lis/publish_node_ancestors.tsv
[INFO] [2019-10-14 13:51:19] (17246 lines) /app/public/data/nicaragua_sp_lis/publish_scientific_names.tsv
[INFO] [2019-10-14 13:51:19] (11569 lines) /app/public/data/nicaragua_sp_lis/publish_traits.tsv
[INFO] [2019-10-14 13:51:19] (57834 lines) /app/public/data/nicaragua_sp_lis/publish_metadata.tsv
[STOP] [2019-10-14 13:51:20] complete_harvest_instance
[START] [2019-10-14 13:51:20] completed
[STOP] [2019-10-14 13:51:20] completed
[STOP] [2019-10-14 13:51:20] logged process, took 1334.59
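The log above brackets each stage with a `[START]` and a `[STOP]` entry, so per-stage wall-clock time can be recovered by pairing the two timestamps. The following is a minimal sketch of that idea, not part of the harvester itself: the function name `stage_durations` and the regular expression are assumptions made for illustration, and the parser assumes the exact `[KIND] [YYYY-MM-DD HH:MM:SS] name` layout seen in this log.

```python
import re
from datetime import datetime

# Assumed line layout, taken from the log above: [KIND] [timestamp] stage name
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)


def stage_durations(lines):
    """Return (stage, seconds) pairs in the order the stages stopped."""
    open_stages = {}  # stage name -> start time
    durations = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip [INFO], [CMD], [WARN] and non-log lines
        kind, stamp, name = match.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            open_stages[name] = when
        else:
            # A STOP line may carry a suffix such as ", took 1334.59"
            name = name.split(",")[0]
            started = open_stages.pop(name, None)
            if started is not None:
                durations.append((name, (when - started).total_seconds()))
    return durations


# Four lines copied from the log above
sample = [
    "[START] [2019-10-14 13:33:12] map_all_nodes_to_pages",
    "[STOP] [2019-10-14 13:45:44] map_all_nodes_to_pages",
    "[START] [2019-10-14 13:45:50] reindex_search",
    "[STOP] [2019-10-14 13:46:26] reindex_search",
]
for stage, seconds in stage_durations(sample):
    print(f"{stage}: {seconds:.0f}s")
```

Run on the sample lines, this reports 752 seconds for `map_all_nodes_to_pages` and 36 seconds for `reindex_search`. Nested stages (for example `Flattener#flatten` inside `rebuild_nodes`) are handled because each stage is keyed by name rather than by position in a stack.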
Latest Process