Harvest for Symonds and Tattersall, 2010
Created: 21 Jan 13:12

Stage: completed
Fetched: 21 Jan 13:12
Validated: 21 Jan 13:12
Deltas Created: 21 Jan 13:12
Units Normalized: 21 Jan 13:14
Ancestry Built: 21 Jan 13:13
Nodes Matched: 21 Jan 13:14
Names Parsed: 21 Jan 13:13
New Models Stored: 21 Jan 13:13
Indexed: 21 Jan 13:14
Completed: 21 Jan 13:15
Time to Harvest: about 3.5 minutes (the log reports "took 211.13", spanning 13:12:22 to 13:15:53)

Harvesting Log

(159 lines)
# Logfile created on 2020-01-21 13:12:22 -0500 by logger.rb/56815
[START] [2020-01-21 13:12:22] logged process
[START] [2020-01-21 13:12:22] create_harvest_instance
[STOP] [2020-01-21 13:12:25] create_harvest_instance
[START] [2020-01-21 13:12:25] fetch_files
[STOP] [2020-01-21 13:12:25] fetch_files
[START] [2020-01-21 13:12:25] validate_each_file
[STOP] [2020-01-21 13:12:26] validate_each_file
[START] [2020-01-21 13:12:26] convert_to_csv
[CMD] [2020-01-21 13:12:26] /usr/bin/sort /app/public/converted_csv/symonds-tatters_agents_20056.csv > /app/public/converted_csv/symonds-tatters_agents_20056.csv_sorted
[CMD] [2020-01-21 13:12:26] /usr/bin/sort /app/public/converted_csv/symonds-tatters_refs_20057.csv > /app/public/converted_csv/symonds-tatters_refs_20057.csv_sorted
[CMD] [2020-01-21 13:12:27] /usr/bin/sort /app/public/converted_csv/symonds-tatters_nodes_20058.csv > /app/public/converted_csv/symonds-tatters_nodes_20058.csv_sorted
[CMD] [2020-01-21 13:12:28] /usr/bin/sort /app/public/converted_csv/symonds-tatters_media_20059.csv > /app/public/converted_csv/symonds-tatters_media_20059.csv_sorted
[CMD] [2020-01-21 13:12:28] /usr/bin/sort /app/public/converted_csv/symonds-tatters_vernaculars_20060.csv > /app/public/converted_csv/symonds-tatters_vernaculars_20060.csv_sorted
[CMD] [2020-01-21 13:12:29] /usr/bin/sort /app/public/converted_csv/symonds-tatters_occurrences_20061.csv > /app/public/converted_csv/symonds-tatters_occurrences_20061.csv_sorted
[CMD] [2020-01-21 13:12:30] /usr/bin/sort /app/public/converted_csv/symonds-tatters_assocs_20062.csv > /app/public/converted_csv/symonds-tatters_assocs_20062.csv_sorted
[CMD] [2020-01-21 13:12:31] /usr/bin/sort /app/public/converted_csv/symonds-tatters_measurements_20063.csv > /app/public/converted_csv/symonds-tatters_measurements_20063.csv_sorted
[STOP] [2020-01-21 13:12:31] convert_to_csv
[START] [2020-01-21 13:12:31] calculate_delta
[CMD] [2020-01-21 13:12:31] echo "0a" > /app/public/diff/symonds-tatters_agents_20056.diff
[CMD] [2020-01-21 13:12:32] tail -n +1 /app/public/converted_csv/symonds-tatters_agents_20056.csv >> /app/public/diff/symonds-tatters_agents_20056.diff
[CMD] [2020-01-21 13:12:33] echo "." >> /app/public/diff/symonds-tatters_agents_20056.diff
[CMD] [2020-01-21 13:12:33] echo "0a" > /app/public/diff/symonds-tatters_refs_20057.diff
[CMD] [2020-01-21 13:12:34] tail -n +1 /app/public/converted_csv/symonds-tatters_refs_20057.csv >> /app/public/diff/symonds-tatters_refs_20057.diff
[CMD] [2020-01-21 13:12:35] echo "." >> /app/public/diff/symonds-tatters_refs_20057.diff
[CMD] [2020-01-21 13:12:35] echo "0a" > /app/public/diff/symonds-tatters_nodes_20058.diff
[CMD] [2020-01-21 13:12:36] tail -n +1 /app/public/converted_csv/symonds-tatters_nodes_20058.csv >> /app/public/diff/symonds-tatters_nodes_20058.diff
[CMD] [2020-01-21 13:12:37] echo "." >> /app/public/diff/symonds-tatters_nodes_20058.diff
[CMD] [2020-01-21 13:12:37] echo "0a" > /app/public/diff/symonds-tatters_media_20059.diff
[CMD] [2020-01-21 13:12:38] tail -n +1 /app/public/converted_csv/symonds-tatters_media_20059.csv >> /app/public/diff/symonds-tatters_media_20059.diff
[CMD] [2020-01-21 13:12:39] echo "." >> /app/public/diff/symonds-tatters_media_20059.diff
[CMD] [2020-01-21 13:12:39] echo "0a" > /app/public/diff/symonds-tatters_vernaculars_20060.diff
[CMD] [2020-01-21 13:12:40] tail -n +1 /app/public/converted_csv/symonds-tatters_vernaculars_20060.csv >> /app/public/diff/symonds-tatters_vernaculars_20060.diff
[CMD] [2020-01-21 13:12:41] echo "." >> /app/public/diff/symonds-tatters_vernaculars_20060.diff
[CMD] [2020-01-21 13:12:42] echo "0a" > /app/public/diff/symonds-tatters_occurrences_20061.diff
[CMD] [2020-01-21 13:12:42] tail -n +1 /app/public/converted_csv/symonds-tatters_occurrences_20061.csv >> /app/public/diff/symonds-tatters_occurrences_20061.diff
[CMD] [2020-01-21 13:12:43] echo "." >> /app/public/diff/symonds-tatters_occurrences_20061.diff
[CMD] [2020-01-21 13:12:44] echo "0a" > /app/public/diff/symonds-tatters_assocs_20062.diff
[CMD] [2020-01-21 13:12:44] tail -n +1 /app/public/converted_csv/symonds-tatters_assocs_20062.csv >> /app/public/diff/symonds-tatters_assocs_20062.diff
[CMD] [2020-01-21 13:12:45] echo "." >> /app/public/diff/symonds-tatters_assocs_20062.diff
[CMD] [2020-01-21 13:12:46] echo "0a" > /app/public/diff/symonds-tatters_measurements_20063.diff
[CMD] [2020-01-21 13:12:46] tail -n +1 /app/public/converted_csv/symonds-tatters_measurements_20063.csv >> /app/public/diff/symonds-tatters_measurements_20063.diff
[CMD] [2020-01-21 13:12:47] echo "." >> /app/public/diff/symonds-tatters_measurements_20063.diff
[STOP] [2020-01-21 13:12:48] calculate_delta
[START] [2020-01-21 13:12:48] parse_diff_and_store
[INFO] [2020-01-21 13:12:48] Loading agents diff file into memory (true lines)...
[INFO] [2020-01-21 13:12:49] Loading refs diff file into memory (true lines)...
[INFO] [2020-01-21 13:12:50] Loading nodes diff file into memory (true lines)...
[INFO] [2020-01-21 13:12:51] Loading media diff file into memory (true lines)...
[INFO] [2020-01-21 13:12:51] Loading vernaculars diff file into memory (true lines)...
[INFO] [2020-01-21 13:12:52] Loading occurrences diff file into memory (true lines)...
[INFO] [2020-01-21 13:12:53] Loading assocs diff file into memory (true lines)...
[INFO] [2020-01-21 13:12:53] Loading measurements diff file into memory (true lines)...
[INFO] [2020-01-21 13:13:04] Storing 14 References
[INFO] [2020-01-21 13:13:04] Processing group of 14 in 1 groups of 1000
[INFO] [2020-01-21 13:13:04] Average Time: 0.0
[INFO] [2020-01-21 13:13:04] Total Time: 1s
[INFO] [2020-01-21 13:13:04] Storing 294 ScientificNames
[INFO] [2020-01-21 13:13:04] Processing group of 294 in 1 groups of 1000
[INFO] [2020-01-21 13:13:04] Average Time: 0.2
[INFO] [2020-01-21 13:13:04] Total Time: 1s
[INFO] [2020-01-21 13:13:04] Storing 294 Nodes
[INFO] [2020-01-21 13:13:04] Processing group of 294 in 1 groups of 1000
[INFO] [2020-01-21 13:13:04] Average Time: 0.15
[INFO] [2020-01-21 13:13:04] Total Time: 1s
[INFO] [2020-01-21 13:13:04] Storing 215 Occurrences
[INFO] [2020-01-21 13:13:04] Processing group of 215 in 1 groups of 1000
[INFO] [2020-01-21 13:13:04] Average Time: 0.04
[INFO] [2020-01-21 13:13:04] Total Time: 1s
[INFO] [2020-01-21 13:13:04] Storing 3166 TraitsReferences
[INFO] [2020-01-21 13:13:04] Processing group of 3166 in 4 groups of 1000
[INFO] [2020-01-21 13:13:05] Average Time: 0.083
[INFO] [2020-01-21 13:13:05] Total Time: 1s
[INFO] [2020-01-21 13:13:05] Storing 1136 Traits
[INFO] [2020-01-21 13:13:05] Processing group of 1136 in 2 groups of 1000
[INFO] [2020-01-21 13:13:05] Average Time: 0.265
[INFO] [2020-01-21 13:13:05] Total Time: 1s
[INFO] [2020-01-21 13:13:05] Storing 4891 MetaTraits
[INFO] [2020-01-21 13:13:05] Processing group of 4891 in 5 groups of 1000
[INFO] [2020-01-21 13:13:06] Average Time: 0.12
[INFO] [2020-01-21 13:13:06] Total Time: 1s
[STOP] [2020-01-21 13:13:06] parse_diff_and_store
[START] [2020-01-21 13:13:06] resolve_keys
[INFO] [2020-01-21 13:13:10] Occurrences to nodes (through scientific_names)...
[INFO] [2020-01-21 13:13:10] traits to occurrences...
[INFO] [2020-01-21 13:13:10] traits to nodes (through occurrences)...
[INFO] [2020-01-21 13:13:10] Traits to sex term...
[INFO] [2020-01-21 13:13:10] Traits to lifestage term...
[INFO] [2020-01-21 13:13:10] MetaTraits to traits...
[INFO] [2020-01-21 13:13:11] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2020-01-21 13:13:11] Assocs to occurrences...
[INFO] [2020-01-21 13:13:11] Assocs to nodes...
[INFO] [2020-01-21 13:13:11] Assoc to sex term...
[INFO] [2020-01-21 13:13:11] Assoc to lifestage term...
[STOP] [2020-01-21 13:13:11] resolve_keys
[START] [2020-01-21 13:13:11] hold_for_later_1
[STOP] [2020-01-21 13:13:11] hold_for_later_1
[START] [2020-01-21 13:13:11] hold_for_later_2
[STOP] [2020-01-21 13:13:11] hold_for_later_2
[START] [2020-01-21 13:13:11] resolve_missing_parents
[STOP] [2020-01-21 13:13:11] resolve_missing_parents
[START] [2020-01-21 13:13:11] rebuild_nodes
[START] [2020-01-21 13:13:11] Flattener#flatten
[START] [2020-01-21 13:13:11] Flattener#study_resource
[START] [2020-01-21 13:13:11] Flattener#build_ancestry
[STOP] [2020-01-21 13:13:11] Flattener#build_ancestry
[INFO] [2020-01-21 13:13:11] 294 ancestry keys
[START] [2020-01-21 13:13:11] build_node_ancestors
[INFO] [2020-01-21 13:13:11] old ancestors deleted.
[STOP] [2020-01-21 13:13:11] build_node_ancestors
[START] [2020-01-21 13:13:11] Flattener#propagate_ancestor_ids
[STOP] [2020-01-21 13:13:11] Flattener#propagate_ancestor_ids
[STOP] [2020-01-21 13:13:11] Flattener#flatten
[STOP] [2020-01-21 13:13:11] rebuild_nodes
[START] [2020-01-21 13:13:11] resolve_missing_media_owners
[STOP] [2020-01-21 13:13:11] resolve_missing_media_owners
[START] [2020-01-21 13:13:11] sanitize_media_verbatims
[STOP] [2020-01-21 13:13:11] sanitize_media_verbatims
[START] [2020-01-21 13:13:11] queue_downloads
[STOP] [2020-01-21 13:13:11] queue_downloads
[START] [2020-01-21 13:13:11] parse_names
[WARN] [2020-01-21 13:13:11] I see 294 names which still need to be parsed.
[STOP] [2020-01-21 13:13:13] parse_names
[START] [2020-01-21 13:13:13] denormalize_canonical_names_to_nodes
[STOP] [2020-01-21 13:13:13] denormalize_canonical_names_to_nodes
[START] [2020-01-21 13:13:13] match_nodes
[START] [2020-01-21 13:13:13] map_all_nodes_to_pages
[STOP] [2020-01-21 13:14:06] map_all_nodes_to_pages
[INFO] [2020-01-21 13:14:06] 34 Unmatched nodes (of 294)! That's too many to output. First 10: Cacatua leadbeateri (#62906647); Cacatua roseicapilla (#62906649); Glossopsitta porphyrocephala (#62906699); Glossopsitta pusilla (#62906700); Polytelis alexandriae (#62906806); Psephotus chrysopterygius (#62906812); Psephotus dissimilis (#62906813); Psephotus haemanotus (#62906814); Psephotus varius (#62906815); Purpureicephala (#62906829)
[START] [2020-01-21 13:14:06] update_nodes
[STOP] [2020-01-21 13:14:06] update_nodes
[STOP] [2020-01-21 13:14:06] match_nodes
[START] [2020-01-21 13:14:06] reindex_search
[STOP] [2020-01-21 13:14:07] reindex_search
[START] [2020-01-21 13:14:07] normalize_units
[STOP] [2020-01-21 13:14:11] normalize_units
[START] [2020-01-21 13:14:11] calculate_statistics
[STOP] [2020-01-21 13:14:11] calculate_statistics
[START] [2020-01-21 13:14:11] complete_harvest_instance
[START] [2020-01-21 13:14:11] overall_tsv_creation
[INFO] [2020-01-21 13:14:11] Processing group of 294 in 1 batches of 10000
[INFO] [2020-01-21 13:14:57] 1136 Traits (unfiltered)...
[INFO] [2020-01-21 13:15:10] 1136 Traits (filtered)...
[INFO] [2020-01-21 13:15:10] 0 Associations (filtered)...
[INFO] [2020-01-21 13:15:49] 8051 metadata added.
[INFO] [2020-01-21 13:15:49] 0 metadata added.
[INFO] [2020-01-21 13:15:49] Average Time: 75.52
[INFO] [2020-01-21 13:15:49] Total Time: 1m39s
[STOP] [2020-01-21 13:15:49] overall_tsv_creation
[INFO] [2020-01-21 13:15:49] Done. Check your files:
[INFO] [2020-01-21 13:15:50] (294 lines) /app/public/data/symonds-tatters/publish_nodes.tsv
[INFO] [2020-01-21 13:15:51] (793 lines) /app/public/data/symonds-tatters/publish_node_ancestors.tsv
[INFO] [2020-01-21 13:15:51] (294 lines) /app/public/data/symonds-tatters/publish_scientific_names.tsv
[INFO] [2020-01-21 13:15:52] (1137 lines) /app/public/data/symonds-tatters/publish_traits.tsv
[INFO] [2020-01-21 13:15:53] (8052 lines) /app/public/data/symonds-tatters/publish_metadata.tsv
[STOP] [2020-01-21 13:15:53] complete_harvest_instance
[START] [2020-01-21 13:15:53] completed
[STOP] [2020-01-21 13:15:53] completed
[STOP] [2020-01-21 13:15:53] logged process, took 211.13
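
To see where the time in a run like this goes, the [START]/[STOP] pairs in the log can be turned into per-step durations. The following is only a minimal sketch in Python, not part of the harvester: the function name step_durations and the regular expression are illustrative assumptions, and it presumes each line has the "[KIND] [YYYY-MM-DD HH:MM:SS] name" shape shown above, with an optional ", took N" suffix on the final [STOP] line.

```python
import re
from datetime import datetime

# Assumed line shape, taken from the log above: "[KIND] [timestamp] step name"
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)


def step_durations(log_text):
    """Return {step name: elapsed seconds} for every START/STOP pair found."""
    open_steps = {}  # step name -> datetime of its START line
    durations = {}
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # [INFO], [CMD], [WARN] and header lines are ignored
        kind, stamp, name = match.groups()
        # The last STOP line carries a ", took 211.13" suffix; drop it so the
        # name matches its START line.
        name = name.split(", took ")[0]
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            open_steps[name] = when
        elif name in open_steps:
            started = open_steps.pop(name)
            durations[name] = (when - started).total_seconds()
    return durations


# A few lines copied from the log above.
sample = """\
[START] [2020-01-21 13:13:13] match_nodes
[START] [2020-01-21 13:13:13] map_all_nodes_to_pages
[STOP] [2020-01-21 13:14:06] map_all_nodes_to_pages
[INFO] [2020-01-21 13:14:06] 34 Unmatched nodes (of 294)!
[STOP] [2020-01-21 13:14:06] match_nodes
[START] [2020-01-21 13:14:07] normalize_units
[STOP] [2020-01-21 13:14:11] normalize_units
"""

for step, seconds in step_durations(sample).items():
    print(f"{step}: {seconds:.0f}s")
```

Nested steps are handled because each name is tracked independently. Timestamps in this log have one-second resolution, so steps that start and stop within the same second come out as 0.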
