Stage:
completed
Fetched:
12 Oct 05:46
Validated:
12 Oct 05:46
Deltas Created:
12 Oct 05:46
Units Normalized:
12 Oct 06:43
Ancestry Built:
12 Oct 05:59
Nodes Matched:
12 Oct 06:41
Names Parsed:
12 Oct 06:00
New Models Stored:
12 Oct 05:53
Indexed:
12 Oct 06:43
Completed:
12 Oct 06:59
Time to Harvest:
1 minute
Harvesting Log
(176 lines)
# Logfile created on 2019-10-12 05:46:32 -0400 by logger.rb/56815
[START] [2019-10-12 05:46:32] logged process
[START] [2019-10-12 05:46:32] create_harvest_instance
[STOP] [2019-10-12 05:46:33] create_harvest_instance
[START] [2019-10-12 05:46:33] fetch_files
[STOP] [2019-10-12 05:46:33] fetch_files
[START] [2019-10-12 05:46:33] validate_each_file
[STOP] [2019-10-12 05:46:41] validate_each_file
[START] [2019-10-12 05:46:41] convert_to_csv
[CMD] [2019-10-12 05:46:41] /usr/bin/sort /app/public/converted_csv/colombia_sp_list_refs_15451.csv > /app/public/converted_csv/colombia_sp_list_refs_15451.csv_sorted
[CMD] [2019-10-12 05:46:41] /usr/bin/sort /app/public/converted_csv/colombia_sp_list_nodes_15452.csv > /app/public/converted_csv/colombia_sp_list_nodes_15452.csv_sorted
[CMD] [2019-10-12 05:46:42] /usr/bin/sort /app/public/converted_csv/colombia_sp_list_occurrences_15453.csv > /app/public/converted_csv/colombia_sp_list_occurrences_15453.csv_sorted
[CMD] [2019-10-12 05:46:42] /usr/bin/sort /app/public/converted_csv/colombia_sp_list_measurements_15454.csv > /app/public/converted_csv/colombia_sp_list_measurements_15454.csv_sorted
[STOP] [2019-10-12 05:46:42] convert_to_csv
[START] [2019-10-12 05:46:42] calculate_delta
[CMD] [2019-10-12 05:46:42] echo "0a" > /app/public/diff/colombia_sp_list_refs_15451.diff
[CMD] [2019-10-12 05:46:42] tail -n +1 /app/public/converted_csv/colombia_sp_list_refs_15451.csv >> /app/public/diff/colombia_sp_list_refs_15451.diff
[CMD] [2019-10-12 05:46:42] echo "." >> /app/public/diff/colombia_sp_list_refs_15451.diff
[CMD] [2019-10-12 05:46:42] echo "0a" > /app/public/diff/colombia_sp_list_nodes_15452.diff
[CMD] [2019-10-12 05:46:42] tail -n +1 /app/public/converted_csv/colombia_sp_list_nodes_15452.csv >> /app/public/diff/colombia_sp_list_nodes_15452.diff
[CMD] [2019-10-12 05:46:42] echo "." >> /app/public/diff/colombia_sp_list_nodes_15452.diff
[CMD] [2019-10-12 05:46:43] echo "0a" > /app/public/diff/colombia_sp_list_occurrences_15453.diff
[CMD] [2019-10-12 05:46:43] tail -n +1 /app/public/converted_csv/colombia_sp_list_occurrences_15453.csv >> /app/public/diff/colombia_sp_list_occurrences_15453.diff
[CMD] [2019-10-12 05:46:43] echo "." >> /app/public/diff/colombia_sp_list_occurrences_15453.diff
[CMD] [2019-10-12 05:46:43] echo "0a" > /app/public/diff/colombia_sp_list_measurements_15454.diff
[CMD] [2019-10-12 05:46:43] tail -n +1 /app/public/converted_csv/colombia_sp_list_measurements_15454.csv >> /app/public/diff/colombia_sp_list_measurements_15454.diff
[CMD] [2019-10-12 05:46:43] echo "." >> /app/public/diff/colombia_sp_list_measurements_15454.diff
[STOP] [2019-10-12 05:46:43] calculate_delta
[START] [2019-10-12 05:46:43] parse_diff_and_store
[INFO] [2019-10-12 05:46:43] Loading refs diff file into memory (true lines)...
[INFO] [2019-10-12 05:46:43] Loading nodes diff file into memory (true lines)...
[INFO] [2019-10-12 05:47:06] Loading occurrences diff file into memory (true lines)...
[INFO] [2019-10-12 05:47:12] Loading measurements diff file into memory (true lines)...
[INFO] [2019-10-12 05:51:48] Storing 2 References
[INFO] [2019-10-12 05:51:48] Processing group of 2 in 1 groups of 1000
[INFO] [2019-10-12 05:51:48] Average Time: 0.0
[INFO] [2019-10-12 05:51:48] Total Time: 1s
[INFO] [2019-10-12 05:51:48] Storing 57627 ScientificNames
[INFO] [2019-10-12 05:51:48] Processing group of 57627 in 58 groups of 1000
[INFO] [2019-10-12 05:52:14] Average Time: 0.434
[INFO] [2019-10-12 05:52:14] Total Time: 26s
[INFO] [2019-10-12 05:52:14] last 3 / first 3: 0.84
[INFO] [2019-10-12 05:52:14] Std.Dev: 0.29154759474226505; Max: 2.4
[INFO] [2019-10-12 05:52:14] Storing 57627 Nodes
[INFO] [2019-10-12 05:52:14] Processing group of 57627 in 58 groups of 1000
[INFO] [2019-10-12 05:52:37] Average Time: 0.392
[INFO] [2019-10-12 05:52:37] Total Time: 24s
[INFO] [2019-10-12 05:52:37] last 3 / first 3: 0.92
[INFO] [2019-10-12 05:52:37] Std.Dev: 0.3714835124201342; Max: 2.62
[INFO] [2019-10-12 05:52:37] Storing 46272 Occurrences
[INFO] [2019-10-12 05:52:37] Processing group of 46272 in 47 groups of 1000
[INFO] [2019-10-12 05:52:42] Average Time: 0.106
[INFO] [2019-10-12 05:52:42] Total Time: 6s
[INFO] [2019-10-12 05:52:42] last 3 / first 3: 0.8
[INFO] [2019-10-12 05:52:42] Std.Dev: 0.0; Max: 0.22
[INFO] [2019-10-12 05:52:42] Storing 92584 TraitsReferences
[INFO] [2019-10-12 05:52:42] Processing group of 92584 in 93 groups of 1000
[INFO] [2019-10-12 05:52:54] Average Time: 0.125
[INFO] [2019-10-12 05:52:54] Total Time: 13s
[INFO] [2019-10-12 05:52:54] last 3 / first 3: 0.72
[INFO] [2019-10-12 05:52:54] Std.Dev: 0.3082207001484488; Max: 2.62
[INFO] [2019-10-12 05:52:54] Storing 92583 Traits
[INFO] [2019-10-12 05:52:54] Processing group of 92583 in 93 groups of 1000
[INFO] [2019-10-12 05:53:37] Average Time: 0.462
[INFO] [2019-10-12 05:53:37] Total Time: 44s
[INFO] [2019-10-12 05:53:37] last 3 / first 3: 3.23
[INFO] [2019-10-12 05:53:37] Std.Dev: 0.5665686189686118; Max: 3.19
[INFO] [2019-10-12 05:53:37] Storing 92513 MetaTraits
[INFO] [2019-10-12 05:53:37] Processing group of 92513 in 93 groups of 1000
[INFO] [2019-10-12 05:53:58] Average Time: 0.221
[INFO] [2019-10-12 05:53:58] Total Time: 21s
[INFO] [2019-10-12 05:53:58] last 3 / first 3: 6.65
[INFO] [2019-10-12 05:53:58] Std.Dev: 0.43703546766824314; Max: 3.19
[STOP] [2019-10-12 05:53:58] parse_diff_and_store
[START] [2019-10-12 05:53:58] resolve_keys
[INFO] [2019-10-12 05:56:33] Occurrences to nodes (through scientific_names)...
[INFO] [2019-10-12 05:56:42] traits to occurrences...
[INFO] [2019-10-12 05:56:51] traits to nodes (through occurrences)...
[INFO] [2019-10-12 05:56:52] Traits to sex term...
[INFO] [2019-10-12 05:56:58] Traits to lifestage term...
[INFO] [2019-10-12 05:57:05] MetaTraits to traits...
[INFO] [2019-10-12 05:57:11] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-10-12 05:57:24] Assocs to occurrences...
[INFO] [2019-10-12 05:57:24] Assocs to nodes...
[INFO] [2019-10-12 05:57:24] Assoc to sex term...
[INFO] [2019-10-12 05:57:24] Assoc to lifestage term...
[STOP] [2019-10-12 05:57:24] resolve_keys
[START] [2019-10-12 05:57:24] hold_for_later_1
[STOP] [2019-10-12 05:57:24] hold_for_later_1
[START] [2019-10-12 05:57:24] hold_for_later_2
[STOP] [2019-10-12 05:57:24] hold_for_later_2
[START] [2019-10-12 05:57:24] resolve_missing_parents
[STOP] [2019-10-12 05:58:58] resolve_missing_parents
[START] [2019-10-12 05:58:58] rebuild_nodes
[START] [2019-10-12 05:58:58] Flattener#flatten
[START] [2019-10-12 05:58:58] Flattener#study_resource
[START] [2019-10-12 05:58:58] Flattener#build_ancestry
[STOP] [2019-10-12 05:59:08] Flattener#build_ancestry
[INFO] [2019-10-12 05:59:08] 57627 ancestry keys
[START] [2019-10-12 05:59:08] build_node_ancestors
[INFO] [2019-10-12 05:59:08] old ancestors deleted.
[STOP] [2019-10-12 05:59:43] build_node_ancestors
[START] [2019-10-12 05:59:48] Flattener#propagate_ancestor_ids
[STOP] [2019-10-12 05:59:55] Flattener#propagate_ancestor_ids
[STOP] [2019-10-12 05:59:55] Flattener#flatten
[STOP] [2019-10-12 05:59:55] rebuild_nodes
[START] [2019-10-12 05:59:55] resolve_missing_media_owners
[STOP] [2019-10-12 05:59:55] resolve_missing_media_owners
[START] [2019-10-12 05:59:55] sanitize_media_verbatims
[STOP] [2019-10-12 05:59:55] sanitize_media_verbatims
[START] [2019-10-12 05:59:55] queue_downloads
[STOP] [2019-10-12 05:59:55] queue_downloads
[START] [2019-10-12 05:59:55] parse_names
[WARN] [2019-10-12 05:59:55] I see 57627 names which still need to be parsed.
[STOP] [2019-10-12 06:00:40] parse_names
[START] [2019-10-12 06:00:40] denormalize_canonical_names_to_nodes
[STOP] [2019-10-12 06:00:41] denormalize_canonical_names_to_nodes
[START] [2019-10-12 06:00:41] match_nodes
[START] [2019-10-12 06:00:41] map_all_nodes_to_pages
[STOP] [2019-10-12 06:40:39] map_all_nodes_to_pages
[INFO] [2019-10-12 06:40:39] 3689 Unmatched nodes (of 57627)! That's too many to output. First 10: Magnoliophyta (#49209973); Magnoliopsida (#49209972); Faramea parvibractea (#49227478); Faramea orinocensis (#49246577); Palicourea macarthurorum (#49219239); Palicourea dimorphandrioides (#49244960); Palicourea spectabilis (#49245567); Palicourea anacardifolia (#49247770); Palicourea discolor (#49263252); Palicourea mombachensis (#49265966)
[START] [2019-10-12 06:40:39] update_nodes
[STOP] [2019-10-12 06:41:00] update_nodes
[STOP] [2019-10-12 06:41:00] match_nodes
[START] [2019-10-12 06:41:00] reindex_search
[STOP] [2019-10-12 06:43:10] reindex_search
[START] [2019-10-12 06:43:10] normalize_units
[STOP] [2019-10-12 06:43:10] normalize_units
[START] [2019-10-12 06:43:10] calculate_statistics
[STOP] [2019-10-12 06:43:10] calculate_statistics
[START] [2019-10-12 06:43:10] complete_harvest_instance
[START] [2019-10-12 06:43:10] overall_tsv_creation
[INFO] [2019-10-12 06:43:10] Processing group of 57627 in 6 batches of 10000
[INFO] [2019-10-12 06:44:39] 6715 Traits (unfiltered)...
[INFO] [2019-10-12 06:44:53] 6715 Traits (filtered)...
[INFO] [2019-10-12 06:44:53] 0 Associations (filtered)...
[INFO] [2019-10-12 06:45:44] 33575 metadata added.
[INFO] [2019-10-12 06:45:44] 0 metadata added.
[INFO] [2019-10-12 06:47:17] 7989 Traits (unfiltered)...
[INFO] [2019-10-12 06:47:31] 7989 Traits (filtered)...
[INFO] [2019-10-12 06:47:31] 0 Associations (filtered)...
[INFO] [2019-10-12 06:48:31] 39940 metadata added.
[INFO] [2019-10-12 06:48:31] 0 metadata added.
[INFO] [2019-10-12 06:50:06] 8263 Traits (unfiltered)...
[INFO] [2019-10-12 06:50:20] 8263 Traits (filtered)...
[INFO] [2019-10-12 06:50:20] 0 Associations (filtered)...
[INFO] [2019-10-12 06:51:17] 41304 metadata added.
[INFO] [2019-10-12 06:51:17] 0 metadata added.
[INFO] [2019-10-12 06:52:50] 8344 Traits (unfiltered)...
[INFO] [2019-10-12 06:53:04] 8344 Traits (filtered)...
[INFO] [2019-10-12 06:53:04] 0 Associations (filtered)...
[INFO] [2019-10-12 06:54:01] 41708 metadata added.
[INFO] [2019-10-12 06:54:01] 0 metadata added.
[INFO] [2019-10-12 06:55:37] 8304 Traits (unfiltered)...
[INFO] [2019-10-12 06:55:50] 8304 Traits (filtered)...
[INFO] [2019-10-12 06:55:51] 0 Associations (filtered)...
[INFO] [2019-10-12 06:56:50] 41501 metadata added.
[INFO] [2019-10-12 06:56:50] 0 metadata added.
[INFO] [2019-10-12 06:58:12] 6657 Traits (unfiltered)...
[INFO] [2019-10-12 06:58:25] 6657 Traits (filtered)...
[INFO] [2019-10-12 06:58:26] 0 Associations (filtered)...
[INFO] [2019-10-12 06:59:21] 33261 metadata added.
[INFO] [2019-10-12 06:59:21] 0 metadata added.
[INFO] [2019-10-12 06:59:21] Average Time: 135.103
[INFO] [2019-10-12 06:59:21] Total Time: 16m11s
[STOP] [2019-10-12 06:59:21] overall_tsv_creation
[INFO] [2019-10-12 06:59:21] Done. Check your files:
[INFO] [2019-10-12 06:59:21] (57627 lines) /app/public/data/colombia_sp_list/publish_nodes.tsv
[INFO] [2019-10-12 06:59:21] (271762 lines) /app/public/data/colombia_sp_list/publish_node_ancestors.tsv
[INFO] [2019-10-12 06:59:21] (57627 lines) /app/public/data/colombia_sp_list/publish_scientific_names.tsv
[INFO] [2019-10-12 06:59:21] (46273 lines) /app/public/data/colombia_sp_list/publish_traits.tsv
[INFO] [2019-10-12 06:59:21] (231290 lines) /app/public/data/colombia_sp_list/publish_metadata.tsv
[STOP] [2019-10-12 06:59:22] complete_harvest_instance
[START] [2019-10-12 06:59:22] completed
[STOP] [2019-10-12 06:59:22] completed
[STOP] [2019-10-12 06:59:22] logged process, took 4369.3
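The paired [START] and [STOP] entries in the log above can be turned into per-stage durations, which makes it easier to see where the run spent its time. The following is a minimal sketch, assuming every stage line has the shape `[START] [YYYY-MM-DD HH:MM:SS] name` as it does in this log. The function name `stage_durations` is hypothetical and is not part of the harvester.

```python
import re
from datetime import datetime

# Matches lines such as "[START] [2019-10-12 05:46:33] fetch_files".
# [INFO], [CMD] and [WARN] lines do not match and are skipped.
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)


def stage_durations(lines):
    """Return (stage_name, seconds) pairs in the order the stages stop."""
    open_stages = {}  # stage name -> datetime of its [START] line
    durations = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        kind, stamp, name = match.groups()
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            open_stages[name] = when
        else:
            # The last [STOP] line carries a ", took N" suffix; drop it
            # so the name matches its [START] line.
            name = name.split(",")[0]
            started = open_stages.pop(name, None)
            if started is not None:
                durations.append((name, (when - started).total_seconds()))
    return durations


sample = [
    "[START] [2019-10-12 06:00:41] map_all_nodes_to_pages",
    "[INFO] [2019-10-12 06:40:39] some informational line",
    "[STOP] [2019-10-12 06:40:39] map_all_nodes_to_pages",
]
for name, seconds in stage_durations(sample):
    print(f"{name}: {seconds:.0f}s")
```

Run over the full log, a sketch like this would show `map_all_nodes_to_pages` (about 40 minutes) and `overall_tsv_creation` (about 16 minutes) as the two longest stages. The outermost `logged process` pair spans about 4370 seconds, in line with the `took 4369.3` figure on the final line.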