Stage: completed
Fetched: 11 Oct 12:57
Validated: 11 Oct 12:57
Deltas Created: 11 Oct 12:57
Units Normalized: 11 Oct 13:03
Ancestry Built: 11 Oct 12:59
Nodes Matched: 11 Oct 13:03
Names Parsed: 11 Oct 12:59
New Models Stored: 11 Oct 12:58
Indexed: 11 Oct 13:03
Completed: 11 Oct 13:06
Time to Harvest: less than a minute
Harvesting Log (149 lines)
# Logfile created on 2019-10-11 12:57:17 -0400 by logger.rb/56815
[START] [2019-10-11 12:57:17] logged process
[START] [2019-10-11 12:57:17] create_harvest_instance
[STOP] [2019-10-11 12:57:18] create_harvest_instance
[START] [2019-10-11 12:57:18] fetch_files
[STOP] [2019-10-11 12:57:18] fetch_files
[START] [2019-10-11 12:57:18] validate_each_file
[STOP] [2019-10-11 12:57:19] validate_each_file
[START] [2019-10-11 12:57:19] convert_to_csv
[CMD] [2019-10-11 12:57:19] /usr/bin/sort /app/public/converted_csv/benin_sp_list_refs_15124.csv > /app/public/converted_csv/benin_sp_list_refs_15124.csv_sorted
[CMD] [2019-10-11 12:57:19] /usr/bin/sort /app/public/converted_csv/benin_sp_list_nodes_15125.csv > /app/public/converted_csv/benin_sp_list_nodes_15125.csv_sorted
[CMD] [2019-10-11 12:57:19] /usr/bin/sort /app/public/converted_csv/benin_sp_list_occurrences_15126.csv > /app/public/converted_csv/benin_sp_list_occurrences_15126.csv_sorted
[CMD] [2019-10-11 12:57:19] /usr/bin/sort /app/public/converted_csv/benin_sp_list_measurements_15127.csv > /app/public/converted_csv/benin_sp_list_measurements_15127.csv_sorted
[STOP] [2019-10-11 12:57:20] convert_to_csv
[START] [2019-10-11 12:57:20] calculate_delta
[CMD] [2019-10-11 12:57:20] echo "0a" > /app/public/diff/benin_sp_list_refs_15124.diff
[CMD] [2019-10-11 12:57:20] tail -n +1 /app/public/converted_csv/benin_sp_list_refs_15124.csv >> /app/public/diff/benin_sp_list_refs_15124.diff
[CMD] [2019-10-11 12:57:20] echo "." >> /app/public/diff/benin_sp_list_refs_15124.diff
[CMD] [2019-10-11 12:57:20] echo "0a" > /app/public/diff/benin_sp_list_nodes_15125.diff
[CMD] [2019-10-11 12:57:20] tail -n +1 /app/public/converted_csv/benin_sp_list_nodes_15125.csv >> /app/public/diff/benin_sp_list_nodes_15125.diff
[CMD] [2019-10-11 12:57:20] echo "." >> /app/public/diff/benin_sp_list_nodes_15125.diff
[CMD] [2019-10-11 12:57:20] echo "0a" > /app/public/diff/benin_sp_list_occurrences_15126.diff
[CMD] [2019-10-11 12:57:20] tail -n +1 /app/public/converted_csv/benin_sp_list_occurrences_15126.csv >> /app/public/diff/benin_sp_list_occurrences_15126.diff
[CMD] [2019-10-11 12:57:20] echo "." >> /app/public/diff/benin_sp_list_occurrences_15126.diff
[CMD] [2019-10-11 12:57:20] echo "0a" > /app/public/diff/benin_sp_list_measurements_15127.diff
[CMD] [2019-10-11 12:57:21] tail -n +1 /app/public/converted_csv/benin_sp_list_measurements_15127.csv >> /app/public/diff/benin_sp_list_measurements_15127.diff
[CMD] [2019-10-11 12:57:21] echo "." >> /app/public/diff/benin_sp_list_measurements_15127.diff
[STOP] [2019-10-11 12:57:21] calculate_delta
[START] [2019-10-11 12:57:21] parse_diff_and_store
[INFO] [2019-10-11 12:57:21] Loading refs diff file into memory (true lines)...
[INFO] [2019-10-11 12:57:21] Loading nodes diff file into memory (true lines)...
[INFO] [2019-10-11 12:57:24] Loading occurrences diff file into memory (true lines)...
[INFO] [2019-10-11 12:57:25] Loading measurements diff file into memory (true lines)...
[INFO] [2019-10-11 12:57:56] Storing 2 References
[INFO] [2019-10-11 12:57:56] Processing group of 2 in 1 groups of 1000
[INFO] [2019-10-11 12:57:56] Average Time: 0.0
[INFO] [2019-10-11 12:57:56] Total Time: 1s
[INFO] [2019-10-11 12:57:56] Storing 8036 ScientificNames
[INFO] [2019-10-11 12:57:56] Processing group of 8036 in 9 groups of 1000
[INFO] [2019-10-11 12:58:00] Average Time: 0.378
[INFO] [2019-10-11 12:58:00] Total Time: 4s
[INFO] [2019-10-11 12:58:00] last 3 / first 3: 0.64
[INFO] [2019-10-11 12:58:00] Std.Dev: 0.15165750888103102; Max: 0.57
[INFO] [2019-10-11 12:58:00] Storing 8036 Nodes
[INFO] [2019-10-11 12:58:00] Processing group of 8036 in 9 groups of 1000
[INFO] [2019-10-11 12:58:02] Average Time: 0.299
[INFO] [2019-10-11 12:58:02] Total Time: 3s
[INFO] [2019-10-11 12:58:02] last 3 / first 3: 0.75
[INFO] [2019-10-11 12:58:02] Std.Dev: 0.10954451150103323; Max: 0.4
[INFO] [2019-10-11 12:58:02] Storing 5302 Occurrences
[INFO] [2019-10-11 12:58:02] Processing group of 5302 in 6 groups of 1000
[INFO] [2019-10-11 12:58:03] Average Time: 0.102
[INFO] [2019-10-11 12:58:03] Total Time: 1s
[INFO] [2019-10-11 12:58:03] Storing 10768 TraitsReferences
[INFO] [2019-10-11 12:58:03] Processing group of 10768 in 11 groups of 1000
[INFO] [2019-10-11 12:58:04] Average Time: 0.085
[INFO] [2019-10-11 12:58:04] Total Time: 1s
[INFO] [2019-10-11 12:58:04] last 3 / first 3: 0.67
[INFO] [2019-10-11 12:58:04] Std.Dev: 0.03162277660168379; Max: 0.17
[INFO] [2019-10-11 12:58:04] Storing 10767 Traits
[INFO] [2019-10-11 12:58:04] Processing group of 10767 in 11 groups of 1000
[INFO] [2019-10-11 12:58:09] Average Time: 0.486
[INFO] [2019-10-11 12:58:09] Total Time: 6s
[INFO] [2019-10-11 12:58:09] last 3 / first 3: 0.81
[INFO] [2019-10-11 12:58:09] Std.Dev: 0.27386127875258304; Max: 1.24
[INFO] [2019-10-11 12:58:09] Storing 10761 MetaTraits
[INFO] [2019-10-11 12:58:09] Processing group of 10761 in 11 groups of 1000
[INFO] [2019-10-11 12:58:11] Average Time: 0.143
[INFO] [2019-10-11 12:58:11] Total Time: 2s
[INFO] [2019-10-11 12:58:11] last 3 / first 3: 0.86
[INFO] [2019-10-11 12:58:11] Std.Dev: 0.03162277660168379; Max: 0.2
[STOP] [2019-10-11 12:58:11] parse_diff_and_store
[START] [2019-10-11 12:58:11] resolve_keys
[INFO] [2019-10-11 12:58:41] Occurrences to nodes (through scientific_names)...
[INFO] [2019-10-11 12:58:45] traits to occurrences...
[INFO] [2019-10-11 12:58:49] traits to nodes (through occurrences)...
[INFO] [2019-10-11 12:58:49] Traits to sex term...
[INFO] [2019-10-11 12:58:53] Traits to lifestage term...
[INFO] [2019-10-11 12:58:56] MetaTraits to traits...
[INFO] [2019-10-11 12:58:57] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-10-11 12:58:58] Assocs to occurrences...
[INFO] [2019-10-11 12:58:58] Assocs to nodes...
[INFO] [2019-10-11 12:58:58] Assoc to sex term...
[INFO] [2019-10-11 12:58:58] Assoc to lifestage term...
[STOP] [2019-10-11 12:58:58] resolve_keys
[START] [2019-10-11 12:58:58] hold_for_later_1
[STOP] [2019-10-11 12:58:58] hold_for_later_1
[START] [2019-10-11 12:58:58] hold_for_later_2
[STOP] [2019-10-11 12:58:58] hold_for_later_2
[START] [2019-10-11 12:58:58] resolve_missing_parents
[STOP] [2019-10-11 12:59:13] resolve_missing_parents
[START] [2019-10-11 12:59:13] rebuild_nodes
[START] [2019-10-11 12:59:13] Flattener#flatten
[START] [2019-10-11 12:59:13] Flattener#study_resource
[START] [2019-10-11 12:59:13] Flattener#build_ancestry
[STOP] [2019-10-11 12:59:14] Flattener#build_ancestry
[INFO] [2019-10-11 12:59:14] 8036 ancestry keys
[START] [2019-10-11 12:59:14] build_node_ancestors
[INFO] [2019-10-11 12:59:14] old ancestors deleted.
[STOP] [2019-10-11 12:59:15] build_node_ancestors
[START] [2019-10-11 12:59:16] Flattener#propagate_ancestor_ids
[STOP] [2019-10-11 12:59:16] Flattener#propagate_ancestor_ids
[STOP] [2019-10-11 12:59:16] Flattener#flatten
[STOP] [2019-10-11 12:59:16] rebuild_nodes
[START] [2019-10-11 12:59:16] resolve_missing_media_owners
[STOP] [2019-10-11 12:59:16] resolve_missing_media_owners
[START] [2019-10-11 12:59:16] sanitize_media_verbatims
[STOP] [2019-10-11 12:59:16] sanitize_media_verbatims
[START] [2019-10-11 12:59:16] queue_downloads
[STOP] [2019-10-11 12:59:16] queue_downloads
[START] [2019-10-11 12:59:16] parse_names
[WARN] [2019-10-11 12:59:16] I see 8036 names which still need to be parsed.
[STOP] [2019-10-11 12:59:24] parse_names
[START] [2019-10-11 12:59:24] denormalize_canonical_names_to_nodes
[STOP] [2019-10-11 12:59:24] denormalize_canonical_names_to_nodes
[START] [2019-10-11 12:59:24] match_nodes
[START] [2019-10-11 12:59:24] map_all_nodes_to_pages
[STOP] [2019-10-11 13:03:27] map_all_nodes_to_pages
[INFO] [2019-10-11 13:03:27] 428 Unmatched nodes (of 8036)! That's too many to output. First 10: Vitellaria paradoxum (#48766454); Anogeissus (#48759198); Combretum indica (#48765696); Pteleopsis kerstingii (#48766805); Bauhinia reticulatum (#48759220); Albizia altissimum (#48766779); Ormocarpum bibracteatum (#48765445); Acacia dudgeoni (#48761191); Acacia hockii (#48762836); Acacia arabica (#48765840)
[START] [2019-10-11 13:03:27] update_nodes
[STOP] [2019-10-11 13:03:30] update_nodes
[STOP] [2019-10-11 13:03:30] match_nodes
[START] [2019-10-11 13:03:30] reindex_search
[STOP] [2019-10-11 13:03:43] reindex_search
[START] [2019-10-11 13:03:43] normalize_units
[STOP] [2019-10-11 13:03:43] normalize_units
[START] [2019-10-11 13:03:43] calculate_statistics
[STOP] [2019-10-11 13:03:43] calculate_statistics
[START] [2019-10-11 13:03:43] complete_harvest_instance
[START] [2019-10-11 13:03:43] overall_tsv_creation
[INFO] [2019-10-11 13:03:43] Processing group of 8036 in 1 batches of 10000
[INFO] [2019-10-11 13:05:13] 5302 Traits (unfiltered)...
[INFO] [2019-10-11 13:05:29] 5302 Traits (filtered)...
[INFO] [2019-10-11 13:05:29] 0 Associations (filtered)...
[INFO] [2019-10-11 13:06:17] 26503 metadata added.
[INFO] [2019-10-11 13:06:17] 0 metadata added.
[INFO] [2019-10-11 13:06:17] Average Time: 118.75
[INFO] [2019-10-11 13:06:17] Total Time: 2m35s
[STOP] [2019-10-11 13:06:17] overall_tsv_creation
[INFO] [2019-10-11 13:06:17] Done. Check your files:
[INFO] [2019-10-11 13:06:18] (8036 lines) /app/public/data/benin_sp_list/publish_nodes.tsv
[INFO] [2019-10-11 13:06:18] (12817 lines) /app/public/data/benin_sp_list/publish_node_ancestors.tsv
[INFO] [2019-10-11 13:06:18] (8036 lines) /app/public/data/benin_sp_list/publish_scientific_names.tsv
[INFO] [2019-10-11 13:06:18] (5303 lines) /app/public/data/benin_sp_list/publish_traits.tsv
[INFO] [2019-10-11 13:06:18] (26504 lines) /app/public/data/benin_sp_list/publish_metadata.tsv
[STOP] [2019-10-11 13:06:18] complete_harvest_instance
[START] [2019-10-11 13:06:18] completed
[STOP] [2019-10-11 13:06:18] completed
[STOP] [2019-10-11 13:06:18] logged process, took 540.91
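The [START]/[STOP] pairs in the log above can be turned into per-stage durations, which is useful here because the timestamps span roughly nine minutes (12:57:17 to 13:06:18) and the closing line reports 540.91. The following is a minimal sketch, not part of the harvester itself: the function name `stage_durations`, the regular expression, and the handling of the trailing ", took ..." suffix are all assumptions based only on the line format visible in this log.

```python
import re
from datetime import datetime

# Assumed line shape, inferred from this log: "[KIND] [YYYY-MM-DD HH:MM:SS] name"
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)

def stage_durations(lines):
    """Return {stage_name: seconds} from paired [START]/[STOP] lines."""
    started = {}
    durations = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # [INFO], [CMD], [WARN] and header lines are skipped
        kind, stamp, name = m.groups()
        # The final STOP line carries a ", took 540.91" suffix; drop it
        name = name.split(", took ")[0]
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            started[name] = when
        elif name in started:
            durations[name] = (when - started.pop(name)).total_seconds()
    return durations

# Four lines copied from the log above
sample = [
    "[START] [2019-10-11 12:57:17] logged process",
    "[START] [2019-10-11 12:59:24] map_all_nodes_to_pages",
    "[STOP] [2019-10-11 13:03:27] map_all_nodes_to_pages",
    "[STOP] [2019-10-11 13:06:18] logged process, took 540.91",
]
durations = stage_durations(sample)
for name, seconds in sorted(durations.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {seconds:.0f}s")
```

On these sample lines, map_all_nodes_to_pages accounts for 243 of the 541 seconds, so node matching is the longest single stage in this sample. Timestamps have one-second resolution, so very short stages will show as 0 or 1 second.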
Latest Process