Harvest for Zimbabwe Species List Created 17 Oct 11:43

Stage: completed
Fetched: 17 Oct 11:43
Validated: 17 Oct 11:43
Deltas Created: 17 Oct 11:43
Units Normalized: 17 Oct 11:53
Ancestry Built: 17 Oct 11:46
Nodes Matched: 17 Oct 11:53
Names Parsed: 17 Oct 11:46
New Models Stored: 17 Oct 11:44
Indexed: 17 Oct 11:53
Completed: 17 Oct 11:57
Time to Harvest: about 14 minutes
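
The stage timestamps above are not listed in chronological order, so a short sketch can help when reading them. The Python below sorts the stages by time and derives the elapsed time between the first and last stage. It assumes the year 2019 (taken from the log header, since the summary omits it) and an English locale for month abbreviations; the `summary` dict and `parse` helper are illustrative names, not part of the harvester.

```python
from datetime import datetime

# Stage timestamps copied from the summary, in the order the page lists them.
summary = {
    "Fetched": "17 Oct 11:43",
    "Validated": "17 Oct 11:43",
    "Deltas Created": "17 Oct 11:43",
    "Units Normalized": "17 Oct 11:53",
    "Ancestry Built": "17 Oct 11:46",
    "Nodes Matched": "17 Oct 11:53",
    "Names Parsed": "17 Oct 11:46",
    "New Models Stored": "17 Oct 11:44",
    "Indexed": "17 Oct 11:53",
    "Completed": "17 Oct 11:57",
}

def parse(stamp, year=2019):
    # Assumption: the year comes from the log header, not from the summary.
    return datetime.strptime(f"{year} {stamp}", "%Y %d %b %H:%M")

# sorted() is stable, so stages sharing a minute keep their listed order.
ordered = sorted(summary.items(), key=lambda item: parse(item[1]))
elapsed = parse(summary["Completed"]) - parse(summary["Fetched"])

for stage, stamp in ordered:
    print(f"{stamp}  {stage}")
print(f"Elapsed: {elapsed}")
```

The summary only has minute resolution, so the elapsed value is approximate; the log's own START and STOP lines carry second-level timestamps.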

Harvesting Log (oldest first)

# Logfile created on 2019-10-17 11:43:47 -0400 by logger.rb/56815
[START] [2019-10-17 11:43:47] logged process
[START] [2019-10-17 11:43:47] create_harvest_instance
[STOP] [2019-10-17 11:43:48] create_harvest_instance
[START] [2019-10-17 11:43:48] fetch_files
[STOP] [2019-10-17 11:43:48] fetch_files
[START] [2019-10-17 11:43:48] validate_each_file
[STOP] [2019-10-17 11:43:49] validate_each_file
[START] [2019-10-17 11:43:49] convert_to_csv
[CMD] [2019-10-17 11:43:49] /usr/bin/sort /app/public/converted_csv/zimbabwe_sp_list_refs_17761.csv > /app/public/converted_csv/zimbabwe_sp_list_refs_17761.csv_sorted
[CMD] [2019-10-17 11:43:49] /usr/bin/sort /app/public/converted_csv/zimbabwe_sp_list_nodes_17762.csv > /app/public/converted_csv/zimbabwe_sp_list_nodes_17762.csv_sorted
[CMD] [2019-10-17 11:43:49] /usr/bin/sort /app/public/converted_csv/zimbabwe_sp_list_occurrences_17763.csv > /app/public/converted_csv/zimbabwe_sp_list_occurrences_17763.csv_sorted
[CMD] [2019-10-17 11:43:50] /usr/bin/sort /app/public/converted_csv/zimbabwe_sp_list_measurements_17764.csv > /app/public/converted_csv/zimbabwe_sp_list_measurements_17764.csv_sorted
[STOP] [2019-10-17 11:43:50] convert_to_csv
[START] [2019-10-17 11:43:50] calculate_delta
[CMD] [2019-10-17 11:43:50] echo "0a" > /app/public/diff/zimbabwe_sp_list_refs_17761.diff
[CMD] [2019-10-17 11:43:50] tail -n +1 /app/public/converted_csv/zimbabwe_sp_list_refs_17761.csv >> /app/public/diff/zimbabwe_sp_list_refs_17761.diff
[CMD] [2019-10-17 11:43:50] echo "." >> /app/public/diff/zimbabwe_sp_list_refs_17761.diff
[CMD] [2019-10-17 11:43:50] echo "0a" > /app/public/diff/zimbabwe_sp_list_nodes_17762.diff
[CMD] [2019-10-17 11:43:50] tail -n +1 /app/public/converted_csv/zimbabwe_sp_list_nodes_17762.csv >> /app/public/diff/zimbabwe_sp_list_nodes_17762.diff
[CMD] [2019-10-17 11:43:50] echo "." >> /app/public/diff/zimbabwe_sp_list_nodes_17762.diff
[CMD] [2019-10-17 11:43:50] echo "0a" > /app/public/diff/zimbabwe_sp_list_occurrences_17763.diff
[CMD] [2019-10-17 11:43:50] tail -n +1 /app/public/converted_csv/zimbabwe_sp_list_occurrences_17763.csv >> /app/public/diff/zimbabwe_sp_list_occurrences_17763.diff
[CMD] [2019-10-17 11:43:50] echo "." >> /app/public/diff/zimbabwe_sp_list_occurrences_17763.diff
[CMD] [2019-10-17 11:43:50] echo "0a" > /app/public/diff/zimbabwe_sp_list_measurements_17764.diff
[CMD] [2019-10-17 11:43:51] tail -n +1 /app/public/converted_csv/zimbabwe_sp_list_measurements_17764.csv >> /app/public/diff/zimbabwe_sp_list_measurements_17764.diff
[CMD] [2019-10-17 11:43:51] echo "." >> /app/public/diff/zimbabwe_sp_list_measurements_17764.diff
[STOP] [2019-10-17 11:43:51] calculate_delta
[START] [2019-10-17 11:43:51] parse_diff_and_store
[INFO] [2019-10-17 11:43:51] Loading refs diff file into memory (true lines)...
[INFO] [2019-10-17 11:43:51] Loading nodes diff file into memory (true lines)...
[INFO] [2019-10-17 11:43:55] Loading occurrences diff file into memory (true lines)...
[INFO] [2019-10-17 11:43:56] Loading measurements diff file into memory (true lines)...
[INFO] [2019-10-17 11:44:39] Storing 2 References
[INFO] [2019-10-17 11:44:39] Processing group of 2 in 1 groups of 1000
[INFO] [2019-10-17 11:44:39] Average Time: 0.0
[INFO] [2019-10-17 11:44:39] Total Time: 1s
[INFO] [2019-10-17 11:44:39] Storing 11749 ScientificNames
[INFO] [2019-10-17 11:44:39] Processing group of 11749 in 12 groups of 1000
[INFO] [2019-10-17 11:44:43] Average Time: 0.333
[INFO] [2019-10-17 11:44:43] Total Time: 5s
[INFO] [2019-10-17 11:44:43] last 3 / first 3: 0.79
[INFO] [2019-10-17 11:44:43] Std.Dev: 0.044721359549995794; Max: 0.42
[INFO] [2019-10-17 11:44:43] Storing 11749 Nodes
[INFO] [2019-10-17 11:44:43] Processing group of 11749 in 12 groups of 1000
[INFO] [2019-10-17 11:44:47] Average Time: 0.274
[INFO] [2019-10-17 11:44:47] Total Time: 4s
[INFO] [2019-10-17 11:44:47] last 3 / first 3: 0.93
[INFO] [2019-10-17 11:44:47] Std.Dev: 0.03162277660168379; Max: 0.34
[INFO] [2019-10-17 11:44:47] Storing 7230 Occurrences
[INFO] [2019-10-17 11:44:47] Processing group of 7230 in 8 groups of 1000
[INFO] [2019-10-17 11:44:48] Average Time: 0.093
[INFO] [2019-10-17 11:44:48] Total Time: 1s
[INFO] [2019-10-17 11:44:48] last 3 / first 3: 0.81
[INFO] [2019-10-17 11:44:48] Std.Dev: 0.03162277660168379; Max: 0.16
[INFO] [2019-10-17 11:44:48] Storing 15140 TraitsReferences
[INFO] [2019-10-17 11:44:48] Processing group of 15140 in 16 groups of 1000
[INFO] [2019-10-17 11:44:49] Average Time: 0.067
[INFO] [2019-10-17 11:44:49] Total Time: 2s
[INFO] [2019-10-17 11:44:49] last 3 / first 3: 0.6
[INFO] [2019-10-17 11:44:49] Std.Dev: 0.0; Max: 0.12
[INFO] [2019-10-17 11:44:49] Storing 15139 Traits
[INFO] [2019-10-17 11:44:49] Processing group of 15139 in 16 groups of 1000
[INFO] [2019-10-17 11:44:53] Average Time: 0.271
[INFO] [2019-10-17 11:44:53] Total Time: 5s
[INFO] [2019-10-17 11:44:53] last 3 / first 3: 0.63
[INFO] [2019-10-17 11:44:53] Std.Dev: 0.07745966692414834; Max: 0.39
[INFO] [2019-10-17 11:44:53] Storing 15136 MetaTraits
[INFO] [2019-10-17 11:44:53] Processing group of 15136 in 16 groups of 1000
[INFO] [2019-10-17 11:44:55] Average Time: 0.106
[INFO] [2019-10-17 11:44:55] Total Time: 2s
[INFO] [2019-10-17 11:44:55] last 3 / first 3: 0.73
[INFO] [2019-10-17 11:44:55] Std.Dev: 0.03162277660168379; Max: 0.14
[STOP] [2019-10-17 11:44:55] parse_diff_and_store
[START] [2019-10-17 11:44:55] resolve_keys
[INFO] [2019-10-17 11:45:39] Occurrences to nodes (through scientific_names)...
[INFO] [2019-10-17 11:45:43] traits to occurrences...
[INFO] [2019-10-17 11:45:48] traits to nodes (through occurrences)...
[INFO] [2019-10-17 11:45:49] Traits to sex term...
[INFO] [2019-10-17 11:45:53] Traits to lifestage term...
[INFO] [2019-10-17 11:45:58] MetaTraits to traits...
[INFO] [2019-10-17 11:45:59] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-10-17 11:46:01] Assocs to occurrences...
[INFO] [2019-10-17 11:46:01] Assocs to nodes...
[INFO] [2019-10-17 11:46:01] Assoc to sex term...
[INFO] [2019-10-17 11:46:01] Assoc to lifestage term...
[STOP] [2019-10-17 11:46:01] resolve_keys
[START] [2019-10-17 11:46:01] hold_for_later_1
[STOP] [2019-10-17 11:46:01] hold_for_later_1
[START] [2019-10-17 11:46:01] hold_for_later_2
[STOP] [2019-10-17 11:46:01] hold_for_later_2
[START] [2019-10-17 11:46:01] resolve_missing_parents
[STOP] [2019-10-17 11:46:25] resolve_missing_parents
[START] [2019-10-17 11:46:25] rebuild_nodes
[START] [2019-10-17 11:46:25] Flattener#flatten
[START] [2019-10-17 11:46:25] Flattener#study_resource
[START] [2019-10-17 11:46:25] Flattener#build_ancestry
[STOP] [2019-10-17 11:46:27] Flattener#build_ancestry
[INFO] [2019-10-17 11:46:27] 11749 ancestry keys
[START] [2019-10-17 11:46:27] build_node_ancestors
[INFO] [2019-10-17 11:46:27] old ancestors deleted.
[STOP] [2019-10-17 11:46:29] build_node_ancestors
[START] [2019-10-17 11:46:31] Flattener#propagate_ancestor_ids
[STOP] [2019-10-17 11:46:31] Flattener#propagate_ancestor_ids
[STOP] [2019-10-17 11:46:31] Flattener#flatten
[STOP] [2019-10-17 11:46:31] rebuild_nodes
[START] [2019-10-17 11:46:31] resolve_missing_media_owners
[STOP] [2019-10-17 11:46:31] resolve_missing_media_owners
[START] [2019-10-17 11:46:31] sanitize_media_verbatims
[STOP] [2019-10-17 11:46:31] sanitize_media_verbatims
[START] [2019-10-17 11:46:31] queue_downloads
[STOP] [2019-10-17 11:46:31] queue_downloads
[START] [2019-10-17 11:46:31] parse_names
[WARN] [2019-10-17 11:46:31] I see 11749 names which still need to be parsed.
[STOP] [2019-10-17 11:46:41] parse_names
[START] [2019-10-17 11:46:41] denormalize_canonical_names_to_nodes
[STOP] [2019-10-17 11:46:41] denormalize_canonical_names_to_nodes
[START] [2019-10-17 11:46:41] match_nodes
[START] [2019-10-17 11:46:41] map_all_nodes_to_pages
[STOP] [2019-10-17 11:53:23] map_all_nodes_to_pages
[INFO] [2019-10-17 11:53:23] 694 Unmatched nodes (of 11749)! That's too many to output. First 10: Euplectes macrourus (#52737072); Hirundo abyssinica (#52736563); Hirundo semirufa (#52736575); Hirundo cucullata (#52736843); Hirundo fuligula (#52738424); Hirundo rufigula (#52747205); Hirundo senegalensis (#52747219); Delichon urbica (#52739299); Spermestes cucullatus (#52735988); Spermestes bicolor (#52736067)
[START] [2019-10-17 11:53:23] update_nodes
[STOP] [2019-10-17 11:53:27] update_nodes
[STOP] [2019-10-17 11:53:27] match_nodes
[START] [2019-10-17 11:53:27] reindex_search
[STOP] [2019-10-17 11:53:46] reindex_search
[START] [2019-10-17 11:53:46] normalize_units
[STOP] [2019-10-17 11:53:46] normalize_units
[START] [2019-10-17 11:53:46] calculate_statistics
[STOP] [2019-10-17 11:53:46] calculate_statistics
[START] [2019-10-17 11:53:46] complete_harvest_instance
[START] [2019-10-17 11:53:46] overall_tsv_creation
[INFO] [2019-10-17 11:53:46] Processing group of 11749 in 2 batches of 10000
[INFO] [2019-10-17 11:55:15] 6290 Traits (unfiltered)...
[INFO] [2019-10-17 11:55:28] 6290 Traits (filtered)...
[INFO] [2019-10-17 11:55:28] 0 Associations (filtered)...
[INFO] [2019-10-17 11:56:17] 31448 metadata added.
[INFO] [2019-10-17 11:56:17] 0 metadata added.
[INFO] [2019-10-17 11:57:06] 940 Traits (unfiltered)...
[INFO] [2019-10-17 11:57:19] 940 Traits (filtered)...
[INFO] [2019-10-17 11:57:19] 0 Associations (filtered)...
[INFO] [2019-10-17 11:57:57] 4698 metadata added.
[INFO] [2019-10-17 11:57:57] 0 metadata added.
[INFO] [2019-10-17 11:57:57] Average Time: 99.335
[INFO] [2019-10-17 11:57:57] Total Time: 4m11s
[STOP] [2019-10-17 11:57:57] overall_tsv_creation
[INFO] [2019-10-17 11:57:57] Done. Check your files:
[INFO] [2019-10-17 11:57:57] (11749 lines) /app/public/data/zimbabwe_sp_list/publish_nodes.tsv
[INFO] [2019-10-17 11:57:57] (29587 lines) /app/public/data/zimbabwe_sp_list/publish_node_ancestors.tsv
[INFO] [2019-10-17 11:57:57] (11749 lines) /app/public/data/zimbabwe_sp_list/publish_scientific_names.tsv
[INFO] [2019-10-17 11:57:57] (7231 lines) /app/public/data/zimbabwe_sp_list/publish_traits.tsv
[INFO] [2019-10-17 11:57:57] (36147 lines) /app/public/data/zimbabwe_sp_list/publish_metadata.tsv
[STOP] [2019-10-17 11:57:57] complete_harvest_instance
[START] [2019-10-17 11:57:57] completed
[STOP] [2019-10-17 11:57:57] completed
[STOP] [2019-10-17 11:57:57] logged process, took 849.96
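
To see where the time went, the [START] and [STOP] pairs in the log can be matched up by stage name. The following is a minimal Python sketch, not part of the harvester itself: `stage_durations` and `LINE_RE` are hypothetical names, and the line format is inferred only from the entries shown in this log.

```python
import re
from datetime import datetime

# Matches "[START] [2019-10-17 11:43:47] stage name" and the STOP equivalent.
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)

def stage_durations(lines):
    """Return (stage, seconds) tuples in the order the stages finished."""
    started = {}
    durations = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip [INFO], [CMD], [WARN], and the header comment
        kind, stamp, name = match.groups()
        # The final STOP line has a ", took N" suffix; drop it to get the name.
        name = name.split(", took")[0]
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            started[name] = when
        elif name in started:
            durations.append((name, (when - started.pop(name)).total_seconds()))
    return durations

# A few lines copied from the log above.
sample = [
    "# Logfile created on 2019-10-17 11:43:47 -0400 by logger.rb/56815",
    "[START] [2019-10-17 11:43:47] logged process",
    "[START] [2019-10-17 11:46:41] match_nodes",
    "[START] [2019-10-17 11:46:41] map_all_nodes_to_pages",
    "[STOP] [2019-10-17 11:53:23] map_all_nodes_to_pages",
    "[START] [2019-10-17 11:53:23] update_nodes",
    "[STOP] [2019-10-17 11:53:27] update_nodes",
    "[STOP] [2019-10-17 11:53:27] match_nodes",
    "[STOP] [2019-10-17 11:57:57] logged process, took 849.96",
]

result = stage_durations(sample)
for name, seconds in result:
    print(f"{seconds:7.0f}s  {name}")
```

On these sample lines, map_all_nodes_to_pages accounts for 402 of the roughly 850 seconds, which is consistent with the "took 849.96" figure on the final line. Keying by stage name assumes no stage name repeats while an earlier instance is still open, which holds for this log.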
