Stage: completed
Fetched: 15 Oct 14:15
Validated: 15 Oct 14:15
Deltas Created: 15 Oct 14:15
Units Normalized: 15 Oct 14:22
Ancestry Built: 15 Oct 14:16
Nodes Matched: 15 Oct 14:21
Names Parsed: 15 Oct 14:16
New Models Stored: 15 Oct 14:16
Indexed: 15 Oct 14:22
Completed: 15 Oct 14:24
Time to Harvest: less than a minute
Harvesting Log (139 lines)
# Logfile created on 2019-10-15 14:15:33 -0400 by logger.rb/56815
[START] [2019-10-15 14:15:33] logged process
[START] [2019-10-15 14:15:33] create_harvest_instance
[STOP] [2019-10-15 14:15:34] create_harvest_instance
[START] [2019-10-15 14:15:34] fetch_files
[STOP] [2019-10-15 14:15:34] fetch_files
[START] [2019-10-15 14:15:34] validate_each_file
[STOP] [2019-10-15 14:15:34] validate_each_file
[START] [2019-10-15 14:15:34] convert_to_csv
[CMD] [2019-10-15 14:15:34] /usr/bin/sort /app/public/converted_csv/r_union_sp_list_refs_16981.csv > /app/public/converted_csv/r_union_sp_list_refs_16981.csv_sorted
[CMD] [2019-10-15 14:15:35] /usr/bin/sort /app/public/converted_csv/r_union_sp_list_nodes_16982.csv > /app/public/converted_csv/r_union_sp_list_nodes_16982.csv_sorted
[CMD] [2019-10-15 14:15:35] /usr/bin/sort /app/public/converted_csv/r_union_sp_list_occurrences_16983.csv > /app/public/converted_csv/r_union_sp_list_occurrences_16983.csv_sorted
[CMD] [2019-10-15 14:15:35] /usr/bin/sort /app/public/converted_csv/r_union_sp_list_measurements_16984.csv > /app/public/converted_csv/r_union_sp_list_measurements_16984.csv_sorted
[STOP] [2019-10-15 14:15:36] convert_to_csv
[START] [2019-10-15 14:15:36] calculate_delta
[CMD] [2019-10-15 14:15:36] echo "0a" > /app/public/diff/r_union_sp_list_refs_16981.diff
[CMD] [2019-10-15 14:15:36] tail -n +1 /app/public/converted_csv/r_union_sp_list_refs_16981.csv >> /app/public/diff/r_union_sp_list_refs_16981.diff
[CMD] [2019-10-15 14:15:36] echo "." >> /app/public/diff/r_union_sp_list_refs_16981.diff
[CMD] [2019-10-15 14:15:36] echo "0a" > /app/public/diff/r_union_sp_list_nodes_16982.diff
[CMD] [2019-10-15 14:15:37] tail -n +1 /app/public/converted_csv/r_union_sp_list_nodes_16982.csv >> /app/public/diff/r_union_sp_list_nodes_16982.diff
[CMD] [2019-10-15 14:15:37] echo "." >> /app/public/diff/r_union_sp_list_nodes_16982.diff
[CMD] [2019-10-15 14:15:37] echo "0a" > /app/public/diff/r_union_sp_list_occurrences_16983.diff
[CMD] [2019-10-15 14:15:37] tail -n +1 /app/public/converted_csv/r_union_sp_list_occurrences_16983.csv >> /app/public/diff/r_union_sp_list_occurrences_16983.diff
[CMD] [2019-10-15 14:15:38] echo "." >> /app/public/diff/r_union_sp_list_occurrences_16983.diff
[CMD] [2019-10-15 14:15:38] echo "0a" > /app/public/diff/r_union_sp_list_measurements_16984.diff
[CMD] [2019-10-15 14:15:38] tail -n +1 /app/public/converted_csv/r_union_sp_list_measurements_16984.csv >> /app/public/diff/r_union_sp_list_measurements_16984.diff
[CMD] [2019-10-15 14:15:39] echo "." >> /app/public/diff/r_union_sp_list_measurements_16984.diff
[STOP] [2019-10-15 14:15:39] calculate_delta
[START] [2019-10-15 14:15:39] parse_diff_and_store
[INFO] [2019-10-15 14:15:39] Loading refs diff file into memory (true lines)...
[INFO] [2019-10-15 14:15:39] Loading nodes diff file into memory (true lines)...
[INFO] [2019-10-15 14:15:42] Loading occurrences diff file into memory (true lines)...
[INFO] [2019-10-15 14:15:42] Loading measurements diff file into memory (true lines)...
[INFO] [2019-10-15 14:15:59] Storing 2 References
[INFO] [2019-10-15 14:15:59] Processing group of 2 in 1 groups of 1000
[INFO] [2019-10-15 14:15:59] Average Time: 0.0
[INFO] [2019-10-15 14:15:59] Total Time: 1s
[INFO] [2019-10-15 14:15:59] Storing 5102 ScientificNames
[INFO] [2019-10-15 14:15:59] Processing group of 5102 in 6 groups of 1000
[INFO] [2019-10-15 14:16:01] Average Time: 0.328
[INFO] [2019-10-15 14:16:01] Total Time: 2s
[INFO] [2019-10-15 14:16:01] Storing 5102 Nodes
[INFO] [2019-10-15 14:16:01] Processing group of 5102 in 6 groups of 1000
[INFO] [2019-10-15 14:16:03] Average Time: 0.308
[INFO] [2019-10-15 14:16:03] Total Time: 2s
[INFO] [2019-10-15 14:16:03] Storing 2410 Occurrences
[INFO] [2019-10-15 14:16:03] Processing group of 2410 in 3 groups of 1000
[INFO] [2019-10-15 14:16:03] Average Time: 0.09
[INFO] [2019-10-15 14:16:03] Total Time: 1s
[INFO] [2019-10-15 14:16:03] Storing 5740 TraitsReferences
[INFO] [2019-10-15 14:16:03] Processing group of 5740 in 6 groups of 1000
[INFO] [2019-10-15 14:16:03] Average Time: 0.077
[INFO] [2019-10-15 14:16:03] Total Time: 1s
[INFO] [2019-10-15 14:16:03] Storing 5739 Traits
[INFO] [2019-10-15 14:16:03] Processing group of 5739 in 6 groups of 1000
[INFO] [2019-10-15 14:16:05] Average Time: 0.322
[INFO] [2019-10-15 14:16:05] Total Time: 2s
[INFO] [2019-10-15 14:16:05] Storing 5739 MetaTraits
[INFO] [2019-10-15 14:16:05] Processing group of 5739 in 6 groups of 1000
[INFO] [2019-10-15 14:16:06] Average Time: 0.133
[INFO] [2019-10-15 14:16:06] Total Time: 1s
[STOP] [2019-10-15 14:16:06] parse_diff_and_store
[START] [2019-10-15 14:16:06] resolve_keys
[INFO] [2019-10-15 14:16:30] Occurrences to nodes (through scientific_names)...
[INFO] [2019-10-15 14:16:32] traits to occurrences...
[INFO] [2019-10-15 14:16:34] traits to nodes (through occurrences)...
[INFO] [2019-10-15 14:16:34] Traits to sex term...
[INFO] [2019-10-15 14:16:37] Traits to lifestage term...
[INFO] [2019-10-15 14:16:38] MetaTraits to traits...
[INFO] [2019-10-15 14:16:39] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-10-15 14:16:40] Assocs to occurrences...
[INFO] [2019-10-15 14:16:40] Assocs to nodes...
[INFO] [2019-10-15 14:16:40] Assoc to sex term...
[INFO] [2019-10-15 14:16:40] Assoc to lifestage term...
[STOP] [2019-10-15 14:16:40] resolve_keys
[START] [2019-10-15 14:16:40] hold_for_later_1
[STOP] [2019-10-15 14:16:40] hold_for_later_1
[START] [2019-10-15 14:16:40] hold_for_later_2
[STOP] [2019-10-15 14:16:40] hold_for_later_2
[START] [2019-10-15 14:16:40] resolve_missing_parents
[STOP] [2019-10-15 14:16:51] resolve_missing_parents
[START] [2019-10-15 14:16:51] rebuild_nodes
[START] [2019-10-15 14:16:51] Flattener#flatten
[START] [2019-10-15 14:16:51] Flattener#study_resource
[START] [2019-10-15 14:16:51] Flattener#build_ancestry
[STOP] [2019-10-15 14:16:51] Flattener#build_ancestry
[INFO] [2019-10-15 14:16:51] 5102 ancestry keys
[START] [2019-10-15 14:16:51] build_node_ancestors
[INFO] [2019-10-15 14:16:51] old ancestors deleted.
[STOP] [2019-10-15 14:16:52] build_node_ancestors
[START] [2019-10-15 14:16:52] Flattener#propagate_ancestor_ids
[STOP] [2019-10-15 14:16:53] Flattener#propagate_ancestor_ids
[STOP] [2019-10-15 14:16:53] Flattener#flatten
[STOP] [2019-10-15 14:16:53] rebuild_nodes
[START] [2019-10-15 14:16:53] resolve_missing_media_owners
[STOP] [2019-10-15 14:16:53] resolve_missing_media_owners
[START] [2019-10-15 14:16:53] sanitize_media_verbatims
[STOP] [2019-10-15 14:16:53] sanitize_media_verbatims
[START] [2019-10-15 14:16:53] queue_downloads
[STOP] [2019-10-15 14:16:53] queue_downloads
[START] [2019-10-15 14:16:53] parse_names
[WARN] [2019-10-15 14:16:53] I see 5102 names which still need to be parsed.
[STOP] [2019-10-15 14:16:57] parse_names
[START] [2019-10-15 14:16:57] denormalize_canonical_names_to_nodes
[STOP] [2019-10-15 14:16:57] denormalize_canonical_names_to_nodes
[START] [2019-10-15 14:16:57] match_nodes
[START] [2019-10-15 14:16:57] map_all_nodes_to_pages
[STOP] [2019-10-15 14:21:52] map_all_nodes_to_pages
[INFO] [2019-10-15 14:21:52] 257 Unmatched nodes (of 5102)! That's too many to output. First 10: Mesosphaerum pectinata (#51698696); Rosmarinus (#51696619); Laphangium pallidum (#51697430); Seriphium (#51695798); Launaea sarmentosus (#51698767); Combretum indica (#51696452); Pteridium aquilinum (#51694257); Pteris cretica (#51695000); Monogramma (#51695817); Actiniopteris dimorpha (#51698190)
[START] [2019-10-15 14:21:52] update_nodes
[STOP] [2019-10-15 14:21:54] update_nodes
[STOP] [2019-10-15 14:21:54] match_nodes
[START] [2019-10-15 14:21:54] reindex_search
[STOP] [2019-10-15 14:22:08] reindex_search
[START] [2019-10-15 14:22:08] normalize_units
[STOP] [2019-10-15 14:22:08] normalize_units
[START] [2019-10-15 14:22:08] calculate_statistics
[STOP] [2019-10-15 14:22:08] calculate_statistics
[START] [2019-10-15 14:22:08] complete_harvest_instance
[START] [2019-10-15 14:22:08] overall_tsv_creation
[INFO] [2019-10-15 14:22:08] Processing group of 5102 in 1 batches of 10000
[INFO] [2019-10-15 14:23:15] 2410 Traits (unfiltered)...
[INFO] [2019-10-15 14:23:28] 2410 Traits (filtered)...
[INFO] [2019-10-15 14:23:29] 0 Associations (filtered)...
[INFO] [2019-10-15 14:24:11] 12049 metadata added.
[INFO] [2019-10-15 14:24:11] 0 metadata added.
[INFO] [2019-10-15 14:24:11] Average Time: 98.6
[INFO] [2019-10-15 14:24:11] Total Time: 2m3s
[STOP] [2019-10-15 14:24:11] overall_tsv_creation
[INFO] [2019-10-15 14:24:11] Done. Check your files:
[INFO] [2019-10-15 14:24:11] (5102 lines) /app/public/data/r_union_sp_list/publish_nodes.tsv
[INFO] [2019-10-15 14:24:11] (9699 lines) /app/public/data/r_union_sp_list/publish_node_ancestors.tsv
[INFO] [2019-10-15 14:24:12] (5102 lines) /app/public/data/r_union_sp_list/publish_scientific_names.tsv
[INFO] [2019-10-15 14:24:12] (2411 lines) /app/public/data/r_union_sp_list/publish_traits.tsv
[INFO] [2019-10-15 14:24:12] (12050 lines) /app/public/data/r_union_sp_list/publish_metadata.tsv
[STOP] [2019-10-15 14:24:12] complete_harvest_instance
[START] [2019-10-15 14:24:12] completed
[STOP] [2019-10-15 14:24:12] completed
[STOP] [2019-10-15 14:24:12] logged process, took 519.1
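
The [START]/[STOP] pairs in the log above can be turned into per-stage durations. The following is a minimal Python sketch, not part of the harvester itself: the function name `stage_durations` and the regular expression are assumptions made for illustration, and the sample lines are copied from the log. It assumes every timestamp uses the `YYYY-MM-DD HH:MM:SS` layout seen here, and it reports any stage that has a [START] but no [STOP] (as `Flattener#study_resource` does in this log) instead of guessing a duration for it.

```python
import re
from datetime import datetime

# Matches lines such as "[START] [2019-10-15 14:15:33] fetch_files".
# The pattern is an assumption based on the log layout shown above.
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)


def stage_durations(lines):
    """Return (durations, unmatched).

    durations: list of (stage name, whole seconds), in order of each STOP.
    unmatched: names of stages that were started but never stopped.
    """
    open_starts = {}  # stage name -> list of start timestamps still open
    durations = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # [INFO], [CMD], [WARN] and header lines are ignored
        kind, stamp, name = match.groups()
        # The final STOP line carries a suffix like ", took 519.1"; drop it
        # so the name matches its START line.
        name = re.sub(r", took [\d.]+$", "", name)
        ts = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            open_starts.setdefault(name, []).append(ts)
        elif open_starts.get(name):
            started = open_starts[name].pop()
            durations.append((name, int((ts - started).total_seconds())))
    unmatched = [n for n, starts in open_starts.items() if starts]
    return durations, unmatched


# A few lines copied from the log above.
sample = [
    "[START] [2019-10-15 14:15:33] logged process",
    "[START] [2019-10-15 14:16:51] Flattener#study_resource",
    "[START] [2019-10-15 14:16:57] map_all_nodes_to_pages",
    "[STOP] [2019-10-15 14:21:52] map_all_nodes_to_pages",
    "[START] [2019-10-15 14:22:08] overall_tsv_creation",
    "[STOP] [2019-10-15 14:24:11] overall_tsv_creation",
    "[STOP] [2019-10-15 14:24:12] logged process, took 519.1",
]

durations, unmatched = stage_durations(sample)
for name, secs in sorted(durations, key=lambda d: -d[1]):
    print(f"{secs:5d}s  {name}")
print("never stopped:", unmatched)
```

On these sample lines, `map_all_nodes_to_pages` accounts for 295 s and `overall_tsv_creation` for 123 s of the 519 s total, which agrees with the "took 519.1" figure on the last log line.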