Harvest for Australia Species List
Created: 10 Oct 12:15

Stage: completed
Fetched: 10 Oct 12:15
Validated: 10 Oct 12:15
Deltas Created: 10 Oct 12:15
Units Normalized: 10 Oct 14:37
Ancestry Built: 10 Oct 12:51
Nodes Matched: 10 Oct 14:31
Names Parsed: 10 Oct 12:53
New Models Stored: 10 Oct 12:39
Indexed: 10 Oct 14:37
Completed: 10 Oct 15:23
Time to Harvest: 3 hours 8 minutes

Harvesting Log

(233 lines)
# Logfile created on 2019-10-10 12:15:33 -0400 by logger.rb/56815
[START] [2019-10-10 12:15:33] logged process
[START] [2019-10-10 12:15:33] create_harvest_instance
[STOP] [2019-10-10 12:15:34] create_harvest_instance
[START] [2019-10-10 12:15:34] fetch_files
[STOP] [2019-10-10 12:15:34] fetch_files
[START] [2019-10-10 12:15:34] validate_each_file
[STOP] [2019-10-10 12:15:55] validate_each_file
[START] [2019-10-10 12:15:55] convert_to_csv
[CMD] [2019-10-10 12:15:55] /usr/bin/sort /app/public/converted_csv/aus_sp_list_refs_15106.csv > /app/public/converted_csv/aus_sp_list_refs_15106.csv_sorted
[CMD] [2019-10-10 12:15:55] /usr/bin/sort /app/public/converted_csv/aus_sp_list_nodes_15107.csv > /app/public/converted_csv/aus_sp_list_nodes_15107.csv_sorted
[CMD] [2019-10-10 12:15:55] /usr/bin/sort /app/public/converted_csv/aus_sp_list_occurrences_15108.csv > /app/public/converted_csv/aus_sp_list_occurrences_15108.csv_sorted
[CMD] [2019-10-10 12:15:55] /usr/bin/sort /app/public/converted_csv/aus_sp_list_measurements_15109.csv > /app/public/converted_csv/aus_sp_list_measurements_15109.csv_sorted
[STOP] [2019-10-10 12:15:56] convert_to_csv
[START] [2019-10-10 12:15:56] calculate_delta
[CMD] [2019-10-10 12:15:56] echo "0a" > /app/public/diff/aus_sp_list_refs_15106.diff
[CMD] [2019-10-10 12:15:56] tail -n +1 /app/public/converted_csv/aus_sp_list_refs_15106.csv >> /app/public/diff/aus_sp_list_refs_15106.diff
[CMD] [2019-10-10 12:15:56] echo "." >> /app/public/diff/aus_sp_list_refs_15106.diff
[CMD] [2019-10-10 12:15:56] echo "0a" > /app/public/diff/aus_sp_list_nodes_15107.diff
[CMD] [2019-10-10 12:15:56] tail -n +1 /app/public/converted_csv/aus_sp_list_nodes_15107.csv >> /app/public/diff/aus_sp_list_nodes_15107.diff
[CMD] [2019-10-10 12:15:56] echo "." >> /app/public/diff/aus_sp_list_nodes_15107.diff
[CMD] [2019-10-10 12:15:56] echo "0a" > /app/public/diff/aus_sp_list_occurrences_15108.diff
[CMD] [2019-10-10 12:15:56] tail -n +1 /app/public/converted_csv/aus_sp_list_occurrences_15108.csv >> /app/public/diff/aus_sp_list_occurrences_15108.diff
[CMD] [2019-10-10 12:15:56] echo "." >> /app/public/diff/aus_sp_list_occurrences_15108.diff
[CMD] [2019-10-10 12:15:56] echo "0a" > /app/public/diff/aus_sp_list_measurements_15109.diff
[CMD] [2019-10-10 12:15:56] tail -n +1 /app/public/converted_csv/aus_sp_list_measurements_15109.csv >> /app/public/diff/aus_sp_list_measurements_15109.diff
[CMD] [2019-10-10 12:15:56] echo "." >> /app/public/diff/aus_sp_list_measurements_15109.diff
[STOP] [2019-10-10 12:15:56] calculate_delta
[START] [2019-10-10 12:15:56] parse_diff_and_store
[INFO] [2019-10-10 12:15:56] Loading refs diff file into memory (true lines)...
[INFO] [2019-10-10 12:15:56] Loading nodes diff file into memory (true lines)...
[INFO] [2019-10-10 12:17:02] Loading occurrences diff file into memory (true lines)...
[INFO] [2019-10-10 12:17:21] Loading measurements diff file into memory (true lines)...
[INFO] [2019-10-10 12:31:11] Storing 2 References
[INFO] [2019-10-10 12:31:11] Processing group of 2 in 1 groups of 1000
[INFO] [2019-10-10 12:31:11] Average Time: 0.0
[INFO] [2019-10-10 12:31:11] Total Time: 1s
[INFO] [2019-10-10 12:31:11] Storing 163702 ScientificNames
[INFO] [2019-10-10 12:31:11] Processing group of 163702 in 164 groups of 1000
[INFO] [2019-10-10 12:32:53] Average Time: 0.622
[INFO] [2019-10-10 12:32:53] Total Time: 1m43s
[INFO] [2019-10-10 12:32:53] last 3 / first 3: 0.71
[INFO] [2019-10-10 12:32:53] Std.Dev: 1.104536101718726; Max: 6.12
[INFO] [2019-10-10 12:32:53] Storing 163702 Nodes
[INFO] [2019-10-10 12:32:53] Processing group of 163702 in 164 groups of 1000
[INFO] [2019-10-10 12:34:26] Average Time: 0.56
[INFO] [2019-10-10 12:34:26] Total Time: 1m33s
[INFO] [2019-10-10 12:34:26] last 3 / first 3: 0.88
[INFO] [2019-10-10 12:34:26] Std.Dev: 1.1674759098157015; Max: 6.7
[INFO] [2019-10-10 12:34:26] Storing 128759 Occurrences
[INFO] [2019-10-10 12:34:26] Processing group of 128759 in 129 groups of 1000
[INFO] [2019-10-10 12:34:56] Average Time: 0.228
[INFO] [2019-10-10 12:34:56] Total Time: 30s
[INFO] [2019-10-10 12:34:56] last 3 / first 3: 0.58
[INFO] [2019-10-10 12:34:56] Std.Dev: 0.8228000972289685; Max: 6.78
[INFO] [2019-10-10 12:34:56] Storing 257518 TraitsReferences
[INFO] [2019-10-10 12:34:56] Processing group of 257518 in 258 groups of 1000
[INFO] [2019-10-10 12:35:23] Average Time: 0.102
[INFO] [2019-10-10 12:35:23] Total Time: 27s
[INFO] [2019-10-10 12:35:23] last 3 / first 3: 0.53
[INFO] [2019-10-10 12:35:23] Std.Dev: 0.39496835316262996; Max: 6.39
[INFO] [2019-10-10 12:35:23] Storing 257518 Traits
[INFO] [2019-10-10 12:35:23] Processing group of 257518 in 258 groups of 1000
[INFO] [2019-10-10 12:37:48] Average Time: 0.558
[INFO] [2019-10-10 12:37:48] Total Time: 2m26s
[INFO] [2019-10-10 12:37:48] last 3 / first 3: 0.62
[INFO] [2019-10-10 12:37:48] Std.Dev: 1.2934450123604018; Max: 8.11
[INFO] [2019-10-10 12:37:48] Storing 256978 MetaTraits
[INFO] [2019-10-10 12:37:48] Processing group of 256978 in 257 groups of 1000
[INFO] [2019-10-10 12:39:02] Average Time: 0.283
[INFO] [2019-10-10 12:39:02] Total Time: 1m14s
[INFO] [2019-10-10 12:39:02] last 3 / first 3: 0.83
[INFO] [2019-10-10 12:39:02] Std.Dev: 1.1291589790636214; Max: 8.49
[STOP] [2019-10-10 12:39:02] parse_diff_and_store
[START] [2019-10-10 12:39:02] resolve_keys
[INFO] [2019-10-10 12:42:29] Occurrences to nodes (through scientific_names)...
[INFO] [2019-10-10 12:42:41] traits to occurrences...
[INFO] [2019-10-10 12:42:56] traits to nodes (through occurrences)...
[INFO] [2019-10-10 12:42:58] Traits to sex term...
[INFO] [2019-10-10 12:43:07] Traits to lifestage term...
[INFO] [2019-10-10 12:43:17] MetaTraits to traits...
[INFO] [2019-10-10 12:43:32] MetaTraits (simple, measurement row refers to parent) to traits...
[INFO] [2019-10-10 12:44:09] Assocs to occurrences...
[INFO] [2019-10-10 12:44:09] Assocs to nodes...
[INFO] [2019-10-10 12:44:09] Assoc to sex term...
[INFO] [2019-10-10 12:44:09] Assoc to lifestage term...
[STOP] [2019-10-10 12:44:09] resolve_keys
[START] [2019-10-10 12:44:09] hold_for_later_1
[STOP] [2019-10-10 12:44:09] hold_for_later_1
[START] [2019-10-10 12:44:09] hold_for_later_2
[STOP] [2019-10-10 12:44:09] hold_for_later_2
[START] [2019-10-10 12:44:09] resolve_missing_parents
[STOP] [2019-10-10 12:46:56] resolve_missing_parents
[START] [2019-10-10 12:46:56] rebuild_nodes
[START] [2019-10-10 12:46:56] Flattener#flatten
[START] [2019-10-10 12:46:56] Flattener#study_resource
[START] [2019-10-10 12:47:06] Flattener#build_ancestry
[STOP] [2019-10-10 12:48:02] Flattener#build_ancestry
[INFO] [2019-10-10 12:48:02] 163702 ancestry keys
[START] [2019-10-10 12:48:02] build_node_ancestors
[INFO] [2019-10-10 12:48:02] old ancestors deleted.
[STOP] [2019-10-10 12:50:23] build_node_ancestors
[START] [2019-10-10 12:50:26] Flattener#propagate_ancestor_ids
[STOP] [2019-10-10 12:51:12] Flattener#propagate_ancestor_ids
[STOP] [2019-10-10 12:51:12] Flattener#flatten
[STOP] [2019-10-10 12:51:12] rebuild_nodes
[START] [2019-10-10 12:51:12] resolve_missing_media_owners
[STOP] [2019-10-10 12:51:12] resolve_missing_media_owners
[START] [2019-10-10 12:51:12] sanitize_media_verbatims
[STOP] [2019-10-10 12:51:12] sanitize_media_verbatims
[START] [2019-10-10 12:51:12] queue_downloads
[STOP] [2019-10-10 12:51:12] queue_downloads
[START] [2019-10-10 12:51:12] parse_names
[WARN] [2019-10-10 12:51:12] I see 163702 names which still need to be parsed.
[STOP] [2019-10-10 12:53:19] parse_names
[START] [2019-10-10 12:53:19] denormalize_canonical_names_to_nodes
[STOP] [2019-10-10 12:53:21] denormalize_canonical_names_to_nodes
[START] [2019-10-10 12:53:21] match_nodes
[START] [2019-10-10 12:53:21] map_all_nodes_to_pages
[STOP] [2019-10-10 14:31:31] map_all_nodes_to_pages
[INFO] [2019-10-10 14:31:31] 16409 Unmatched nodes (of 163702)! That's too many to output. First 10: Rhipidura albicauda (#48625203); Zosterops halmaturina (#48695208); Zosterops citrinellus (#48705152); Zosterops bowiae (#48739009); Malurus leuconotus (#48666274); Malurus cyanotus (#48699925); Magnamytis (#48747782); Magnamytis woodwardi (#48747781); Lichenostomus penicillatus (#48585675); Lichenostomus virescens (#48585787)
[START] [2019-10-10 14:31:31] update_nodes
[STOP] [2019-10-10 14:31:54] update_nodes
[STOP] [2019-10-10 14:31:54] match_nodes
[START] [2019-10-10 14:31:54] reindex_search
[STOP] [2019-10-10 14:37:03] reindex_search
[START] [2019-10-10 14:37:03] normalize_units
[STOP] [2019-10-10 14:37:04] normalize_units
[START] [2019-10-10 14:37:04] calculate_statistics
[STOP] [2019-10-10 14:37:04] calculate_statistics
[START] [2019-10-10 14:37:04] complete_harvest_instance
[START] [2019-10-10 14:37:04] overall_tsv_creation
[INFO] [2019-10-10 14:37:04] Processing group of 163702 in 17 batches of 10000
[INFO] [2019-10-10 14:38:36] 6519 Traits (unfiltered)...
[INFO] [2019-10-10 14:38:50] 6519 Traits (filtered)...
[INFO] [2019-10-10 14:38:50] 0 Associations (filtered)...
[INFO] [2019-10-10 14:39:41] 32593 metadata added.
[INFO] [2019-10-10 14:39:41] 0 metadata added.
[INFO] [2019-10-10 14:41:14] 7241 Traits (unfiltered)...
[INFO] [2019-10-10 14:41:28] 7241 Traits (filtered)...
[INFO] [2019-10-10 14:41:29] 0 Associations (filtered)...
[INFO] [2019-10-10 14:42:26] 36194 metadata added.
[INFO] [2019-10-10 14:42:26] 0 metadata added.
[INFO] [2019-10-10 14:44:02] 7531 Traits (unfiltered)...
[INFO] [2019-10-10 14:44:16] 7531 Traits (filtered)...
[INFO] [2019-10-10 14:44:16] 0 Associations (filtered)...
[INFO] [2019-10-10 14:45:12] 37641 metadata added.
[INFO] [2019-10-10 14:45:12] 0 metadata added.
[INFO] [2019-10-10 14:46:48] 7722 Traits (unfiltered)...
[INFO] [2019-10-10 14:47:02] 7722 Traits (filtered)...
[INFO] [2019-10-10 14:47:02] 0 Associations (filtered)...
[INFO] [2019-10-10 14:47:57] 38587 metadata added.
[INFO] [2019-10-10 14:47:57] 0 metadata added.
[INFO] [2019-10-10 14:49:37] 7810 Traits (unfiltered)...
[INFO] [2019-10-10 14:49:51] 7810 Traits (filtered)...
[INFO] [2019-10-10 14:49:52] 0 Associations (filtered)...
[INFO] [2019-10-10 14:50:49] 39031 metadata added.
[INFO] [2019-10-10 14:50:49] 0 metadata added.
[INFO] [2019-10-10 14:52:25] 7763 Traits (unfiltered)...
[INFO] [2019-10-10 14:52:39] 7763 Traits (filtered)...
[INFO] [2019-10-10 14:52:39] 0 Associations (filtered)...
[INFO] [2019-10-10 14:53:39] 38794 metadata added.
[INFO] [2019-10-10 14:53:39] 0 metadata added.
[INFO] [2019-10-10 14:55:16] 7903 Traits (unfiltered)...
[INFO] [2019-10-10 14:55:30] 7903 Traits (filtered)...
[INFO] [2019-10-10 14:55:30] 0 Associations (filtered)...
[INFO] [2019-10-10 14:56:27] 39486 metadata added.
[INFO] [2019-10-10 14:56:27] 0 metadata added.
[INFO] [2019-10-10 14:58:04] 7839 Traits (unfiltered)...
[INFO] [2019-10-10 14:58:18] 7839 Traits (filtered)...
[INFO] [2019-10-10 14:58:18] 0 Associations (filtered)...
[INFO] [2019-10-10 14:59:15] 39168 metadata added.
[INFO] [2019-10-10 14:59:15] 0 metadata added.
[INFO] [2019-10-10 15:00:51] 7830 Traits (unfiltered)...
[INFO] [2019-10-10 15:01:05] 7830 Traits (filtered)...
[INFO] [2019-10-10 15:01:05] 0 Associations (filtered)...
[INFO] [2019-10-10 15:02:01] 39118 metadata added.
[INFO] [2019-10-10 15:02:01] 0 metadata added.
[INFO] [2019-10-10 15:03:38] 7906 Traits (unfiltered)...
[INFO] [2019-10-10 15:03:52] 7906 Traits (filtered)...
[INFO] [2019-10-10 15:03:52] 0 Associations (filtered)...
[INFO] [2019-10-10 15:04:49] 39474 metadata added.
[INFO] [2019-10-10 15:04:49] 0 metadata added.
[INFO] [2019-10-10 15:06:25] 8048 Traits (unfiltered)...
[INFO] [2019-10-10 15:06:39] 8048 Traits (filtered)...
[INFO] [2019-10-10 15:06:39] 0 Associations (filtered)...
[INFO] [2019-10-10 15:07:36] 40190 metadata added.
[INFO] [2019-10-10 15:07:36] 0 metadata added.
[INFO] [2019-10-10 15:09:11] 8197 Traits (unfiltered)...
[INFO] [2019-10-10 15:09:26] 8197 Traits (filtered)...
[INFO] [2019-10-10 15:09:26] 0 Associations (filtered)...
[INFO] [2019-10-10 15:10:23] 40950 metadata added.
[INFO] [2019-10-10 15:10:23] 0 metadata added.
[INFO] [2019-10-10 15:11:59] 8179 Traits (unfiltered)...
[INFO] [2019-10-10 15:12:13] 8179 Traits (filtered)...
[INFO] [2019-10-10 15:12:13] 0 Associations (filtered)...
[INFO] [2019-10-10 15:13:10] 40840 metadata added.
[INFO] [2019-10-10 15:13:10] 0 metadata added.
[INFO] [2019-10-10 15:14:46] 8313 Traits (unfiltered)...
[INFO] [2019-10-10 15:15:00] 8313 Traits (filtered)...
[INFO] [2019-10-10 15:15:00] 0 Associations (filtered)...
[INFO] [2019-10-10 15:15:57] 41519 metadata added.
[INFO] [2019-10-10 15:15:57] 0 metadata added.
[INFO] [2019-10-10 15:17:37] 8356 Traits (unfiltered)...
[INFO] [2019-10-10 15:17:51] 8356 Traits (filtered)...
[INFO] [2019-10-10 15:17:51] 0 Associations (filtered)...
[INFO] [2019-10-10 15:18:48] 41736 metadata added.
[INFO] [2019-10-10 15:18:48] 0 metadata added.
[INFO] [2019-10-10 15:20:24] 8420 Traits (unfiltered)...
[INFO] [2019-10-10 15:20:38] 8420 Traits (filtered)...
[INFO] [2019-10-10 15:20:38] 0 Associations (filtered)...
[INFO] [2019-10-10 15:21:36] 42049 metadata added.
[INFO] [2019-10-10 15:21:36] 0 metadata added.
[INFO] [2019-10-10 15:22:40] 3182 Traits (unfiltered)...
[INFO] [2019-10-10 15:22:54] 3182 Traits (filtered)...
[INFO] [2019-10-10 15:22:54] 0 Associations (filtered)...
[INFO] [2019-10-10 15:23:37] 15885 metadata added.
[INFO] [2019-10-10 15:23:37] 0 metadata added.
[INFO] [2019-10-10 15:23:37] Average Time: 137.155
[INFO] [2019-10-10 15:23:37] Total Time: 46m33s
[INFO] [2019-10-10 15:23:37] last 3 / first 3: 0.94
[INFO] [2019-10-10 15:23:37] Std.Dev: 10.506521784111047; Max: 143.81
[STOP] [2019-10-10 15:23:37] overall_tsv_creation
[INFO] [2019-10-10 15:23:37] Done. Check your files:
[INFO] [2019-10-10 15:23:37] (163702 lines) /app/public/data/aus_sp_list/publish_nodes.tsv
[INFO] [2019-10-10 15:23:37] (928240 lines) /app/public/data/aus_sp_list/publish_node_ancestors.tsv
[INFO] [2019-10-10 15:23:37] (163702 lines) /app/public/data/aus_sp_list/publish_scientific_names.tsv
[INFO] [2019-10-10 15:23:37] (128760 lines) /app/public/data/aus_sp_list/publish_traits.tsv
[INFO] [2019-10-10 15:23:38] (643256 lines) /app/public/data/aus_sp_list/publish_metadata.tsv
[STOP] [2019-10-10 15:23:38] complete_harvest_instance
[START] [2019-10-10 15:23:38] completed
[STOP] [2019-10-10 15:23:38] completed
[STOP] [2019-10-10 15:23:38] logged process, took 11284.61
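
The [START]/[STOP] pairs in the log above can be turned into per-stage durations, which makes it easier to see where the harvest spent its time (for example map_all_nodes_to_pages and overall_tsv_creation). The following is a minimal sketch, not part of the harvester: the function name stage_durations is hypothetical, the line pattern is inferred only from the lines shown here, and it assumes the ", took ..." suffix on the final [STOP] line is a trailing annotation rather than part of the stage name.

```python
import re
from datetime import datetime

# Pattern inferred from the log lines above: [KIND] [YYYY-MM-DD HH:MM:SS] stage name
LINE_RE = re.compile(
    r"^\[(START|STOP)\] \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$"
)


def stage_durations(lines):
    """Return (stage, seconds) pairs in the order the stages stopped."""
    started = {}    # stage name -> datetime of its [START] line
    durations = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # [INFO], [CMD], [WARN] and header lines are ignored
        kind, stamp, name = match.groups()
        # Assumption: ", took ..." is an annotation, not part of the name.
        name = name.split(", took ")[0]
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "START":
            started[name] = when
        elif name in started:
            elapsed = when - started.pop(name)
            durations.append((name, int(elapsed.total_seconds())))
    return durations


# A few lines copied from the log above.
sample = [
    "[START] [2019-10-10 12:15:33] logged process",
    "[INFO] [2019-10-10 12:48:02] old ancestors deleted.",
    "[START] [2019-10-10 12:53:21] map_all_nodes_to_pages",
    "[STOP] [2019-10-10 14:31:31] map_all_nodes_to_pages",
    "[STOP] [2019-10-10 15:23:38] logged process, took 11284.61",
]

for stage, seconds in stage_durations(sample):
    print(f"{stage}: {seconds}s")
```

Stages are keyed by name, so nested stages such as Flattener#flatten inside rebuild_nodes are handled independently. A stage name that starts twice before stopping would keep only the most recent start time.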

Latest Process